ABSTRACT


Image segmentation is an essential preliminary step in
automatic pictorial pattern recognition and scene analysis
problems. The objective of segmentation techniques is to
partition an image into regions or components. The purpose
of this thesis is to analyze a segmentation technique called
gradient relaxation. The gradient relaxation method is a
viable method for segmenting objects within an image and is
applicable to images having unimodal distributions. This
method is applied to noisy infrared images in an attempt to
detect and classify the target. The method allows for an
easy selection of a threshold value, which may be required
for other types of image processing on the image. The main
issue is to examine the effectiveness of this technique when
applied to noisy infrared images, having unimodal
distributions, from an uncooled focal plane array sensor.
The technique was able to extract the target in the image,
producing a homogeneous and uniform region for most of the
cases studied. A target which was fragmented into several
parts because of the noise was not detectable. The technique
could be implemented in hardware and applied to the inputs
of a classification system for detectable objects in noisy
infrared images.
image sensors. Information extraction involves the
detection and recognition of patterns within the image.
The human eye has an extraordinary pattern recognition
capability, being able to discern approximately one hundred
shades of gray. However, the eye is not always able to
extract all the information from an image due to radiometric
degradation, geometric distortion, and noise introduced
during recording, transmission, and display of the images.
These factors can severely limit recognition of patterns or
objects. One purpose of image processing is to aid the
human eye in extracting the desired image by removing these
distortions.


Three methods are available for performing image
processing operations: digital, optical, and photographic.
Black and white film can retain a limited range of gray
level intensities (50 or less), whereas digital computers
can represent several hundreds or thousands of gray levels.
[Ref. 1] Optical methods are faster, but do not offer the
flexibility of digital methods. Flexibility is limited by
such factors as the compromise between computation time and
the accuracy of the results. [Ref. 2] Computers can be used
to apply various linear and nonlinear transformations to
images which cannot be performed optically. Digital
information extraction techniques can fully exploit the
statistical nature of digital imagery. These techniques can
also be used for analysis based on correlation of image data
with nonimaging data. This includes correlation of remotely
sensed imagery with nonimaging georeferenced cartographical
data bases.

The digital computer, used in numerically oriented
analysis because of its quantitative character and great
speed, has become a key tool. Numerically oriented remote
sensing takes advantage of the computer to emphasize the
inherently quantitative aspects of the image data, dealing
with the data rather abstractly as a collection of
measurements rather than as an image. Tremendous quantities
of data are of real value only when the data can be acquired
and analyzed both rapidly and cost effectively. The growth
of digital computer technology has enabled the development
of digital image processing techniques. Because of faster
and cheaper computational components, large-capacity
high-density digital data storage devices, and improved
display technology, the processing, manipulation, and
display of large volumes of digital imagery have become
possible. [Ref. 3]


A digital image processing system contains three main
elements, as shown in Figure 1.1. They are defined as follows:


Figure  1.1:  Image  Processing  System 


(1)  Image  Acquisition.  This  involves  the  conversion  of  a 
scene  into  a  digital  representation.  This  element 
can  be  performed  by  a  sensor  system  which  is  designed 
to  view  a  scene  and  provide  a  digital  representation 
of  it.  The  acquisition  involves  the  conversion  of  an 
image  from  a  television  signal  or  film  into  a  digital 
representation.  [Ref.  3]  An  image  sensor  can  be 
characterized  by  a  number  of  features,  including: 

•  signal-to-noise ratio - a measure of the useful
information extracted from the sensor's signal;

•  dynamic range - variation in the range of the
response to light energy;

•  resolution - measure of the smallest detail in
the image which can be retained by the sensor;

•  transfer function - relationship between incoming
light spatial frequency and output spatial
frequency;

•  integration time - the time in which the sensor
accumulates charges generated by the incoming
light;

•  reading speed - the scanning time for a given
total spatial resolution and picture size;

•  spectral sensitivity - the portion of the
electromagnetic spectrum to be used by the
sensor. [Ref. 4]

The sophistication of the acquisition system, based on
the above features and capabilities, will greatly affect
its cost, performance, and reliability. However, no matter
how sophisticated the system is, certain degradations
will be introduced into the image. These
degradations fall into two categories: radiometric
and geometric distortions. Radiometric degradations
arise from blurring effects of the imaging system,
nonlinear amplitude responses, shading, transmission
noise, atmospheric interference (scattering,
attenuation, haze), variable surface illumination
(differences in terrain slope and orientation), and
change of terrain radiance with viewing angle.
Geometric distortions fall into three categories:
sensor-related distortions, such as aberrations in
the optical system or nonlinearities and noise in
the scan-deflection system; sensor-platform-related
distortions, caused by the attitude and altitude of the
sensor; and object-related distortions, caused by Earth
rotation and curvature and by terrain relief. [Ref. 1]

(2) Image Processing. This element provides the digital
processing  of  the  image  or  images  to  produce  a 
desired  result  (Figure  1.2).  This  processing  can 
range  from  simple  enhancement  of  an  image  for  better 
display  of  scene  detail  to  more  complex  processing 
involving  several  component  images.  [Ref.  3]  Digital 
image  processing  techniques  can  be  divided  into  two 
different  groups.  The  first  group  includes 
quantitative  restoration  of  images  to  correct  for 
degradation  and  noise,  registration  for  overlaying 
and  mosaicing,  and  subjective  enhancement  of  image 
features  for  interpretation.  The  second  group  is 
concerned  with  the  extraction  of  information  from  the 
images.  This  area  of  analysis  includes  object 
detection,  segmentation  of  images  into 
characteristically  different  regions,  and 
determination  of  structural  relationships  among  the 
regions.  [Ref.  1]  Within  these  two  groups  fall  two 
categories:  subjective  and  quantitative  processing. 


Figure  1.2:  Image  Processing  Steps.  [Ref.  1] 

Subjective processing is usually performed in an
adaptive, interactive, and iterative manner. It is a
trial and error process, and success is based on the
ability of the observer to detect information of
interest in the final or enhanced image. The changes
achieved in the 'before' and 'after' versions of the
images processed subjectively are often quite
dramatic, despite the relative computational
simplicity of many of the subjective techniques. A
basic tool which is used in performing subjective
enhancement and image analysis is the histogram. The
histogram reveals the distribution of the intensities
within the image; it is represented graphically as a
plot of the number of picture elements (pixels) at a
given intensity versus the gray level intensity.
Quantitative techniques are generally performed on an
image in a nonadaptive, noninteractive manner. The
processing method is based on a predefined
mathematical algorithm, and success in processing is
based on the correctness of the model. Examples of
quantitative processing are the removal of radiometric
and geometric distortions. [Ref. 3]

This  element  can  reduce  some  of  the  requirements 
of the image acquisition system, such as signal-to-noise
ratio, dynamic range, transfer function,
integration  time,  and  reading  speed.  By  reducing 
some  of  the  requirements,  the  cost  of  the  acquisition 
system  can  be  reduced,  and  the  money  saved  can  be 
used  to  improve  the  processing  capabilities  of  the 
complete  imaging  system. 

(3)  Image  Display.  The  final  element  provides  for 
generation  of  an  output  product  that  can  be  seen  by  a 
human  observer.  This  element  provides  the  required 
conversion  of  digital  data  into  an  analog  form. 


Processed  images  can  be  viewed  on  a  volatile  display 
monitor  that  presents  the  digitized  data  in  an  analog 
form  (video  signal).  The  imagery  data  can  be 
recorded  on  film  or  other  hard  copy  format.  [Ref.  3] 


B.  OVERVIEW 

This thesis is concerned with the image processing step
and specifically with image analysis using segmentation
techniques. The segmentation technique used here is called
the gradient relaxation method. This method uses an
iterative probability adjustment process to segment pixels
into two regions, 'light' and 'dark'. The method is highly
dependent on the selection of weighting factors, which
determine the speed at which the segmentation process
converges and the regions to which pixels will be assigned.
Analysis is done on noisy infrared images of ships to
determine whether targets can be detected and/or classified.
Detection is the ability of the observer to sense that an
object of interest is in the field of view. Classification
is defined (in the military sense) as the ability of the
observer to identify the detected object as to its type.
For Army operations, the type could be a tank, truck, or
helicopter. For Naval operations, a large ship, small ship,
combatant, or merchant vessel would be typical types. At
different steps of an engagement, the need to detect or to
classify the object will depend upon the situation.

Chapter II is a survey of contemporary image
segmentation techniques. These techniques are classified
into three categories: characteristic feature thresholding,
edge detection, and region extraction. The specific
algorithm investigated in this thesis is a combination of
feature thresholding and region extraction, using a
relaxation or iterative process for the segmentation of the
image. Chapter III is a discussion of the gradient
relaxation algorithm, the particular method used in this
investigation. This chapter introduces the relaxation
process and develops the gradient relaxation algorithm.
This algorithm is applied to several noisy infrared images
of ships in Chapter IV. An analysis is done of how
effective the algorithm is in reducing or eliminating noise
and of its ability to detect and classify an object in the
field of view. The final chapter summarizes the results and
discusses possible applications, implementation of the
algorithm, and possible future work.

II. TECHNIQUES USED IN SEGMENTATION

A.  SEGMENTATION  BASICS 

A  major  branch  of  image  processing  deals  with  image 
analysis  or  scene  analysis,  where  the  input  is  pictorial, 
but  the  output  is  a  description  of  the  given  picture  or 
scene. The following are examples of image analysis
problems:

(1) The input is text and it is desired to read the text;
here the description of the input consists of a
sequence of characters.

(2) The input is a nuclear bubble chamber picture, and it
is desired to detect and locate certain events (e.g.,
particle collisions); the description consists of a
set of coordinates and names of event types.

(3) The input is a picture of a mitotic cell and the
output is a 'map' showing the arrangement of the
chromosomes in a standard order. This output
requires knowledge of the location and identification
of the chromosomes.

(4) The input is an aerial photograph of terrain with the
desired output being a map showing specific types of
terrain features (vegetation, buildings, ships, roads,
etc.). The construction of this output also requires
the location and identification of the desired
terrain features. [Ref. 5]

In  all  of  these  examples,  the  description  refers  to 
specific  parts  or  objects  in  the  picture  in  terms  of  their 
properties  and  the  relationships  between  the  objects.  Image 
analysis  consists  of  four  steps: 

Step  1:  Segmentation  -  This  is  the  partitioning  of  an 
image  into  different  regions,  each  having 
different  properties. 

Step  2:  Regional  descriptions  -  This  procedure  is  used  to 
characterize  the  segmented  regions  by  a  set  of 
descriptors  which  are  not  sensitive  to  such 
variations  as  changes  in  size,  rotation,  or 
translation.  These  descriptors  will  bring  out 
features  which  will  aid  in  differentiating 
regions  with  different  attributes. 

Step  3:  Relational  descriptions  -  This  procedure  deals 
with  the  organization  of  these  regions  into  a 
meaningful  structure. 

Step  4:  Descriptions  of  similarity  -  The  final  step  deals 
with  the  problem  of  establishing  measures  of 
similarity  between  regions  in  an  image.  [Ref.  6] 

Image segmentation is a critical step in the image
analysis process because errors in segmentation might
propagate through the other processes, producing an incorrect
description of the scene. The question can then be asked:
what should a good image segmentation be? Regions of an
image segmentation should be uniform and homogeneous with
respect to some characteristic such as gray level or
texture. Region interiors should be simple and contain few
gaps or holes. Adjacent regions of a segmented image should
be significantly different in value with respect to the
characteristic on which the regions are homogeneous.
Boundaries of each region should be smooth and spatially
accurate. Achieving these desired properties is difficult
because precisely uniform and homogeneous regions are
typically full of small holes and have jagged boundaries.
Requiring that adjacent regions have a large difference in
value can cause regions to merge and/or boundaries to be
lost. All of these effects introduce errors which are
undesirable. [Ref. 7]


There is neither a standard approach to nor a theory of
image segmentation. Segmentation techniques are
basically ad hoc and differ in the way each emphasizes one
or more of the properties discussed previously and in the
way each strikes a balance between one desired property and
another. T. Pavlidis has commented that an image
segmentation problem is basically one of psychophysical
perception and therefore not susceptible to a purely
analytical solution. Any mathematical algorithm must be
supplemented with heuristics, involving semantics about the
class of images under consideration. Quite often, simple
heuristics are not enough, and it is essential to introduce
a priori knowledge about the image. An example of this is
the dalmatian dog picture (Figure 2.1). Without the a priori
knowledge that the picture contains a dalmatian dog, most
human observers would perceive the picture as pure noise.
However, if the observers are told that the image contains
a dalmatian dog, most will identify the dog in the
picture. [Ref. 8]


Figure 2.1: This picture is perceived to be random noise.
Mention 'dalmatian dog' and that image will be
seen. [Ref. 8]


Almost all segmentation techniques are based on either
the concept of similarity (e.g., characteristic feature
clustering) or discontinuity (e.g., edge detection). These
techniques can be categorized into three areas: (1)
characteristic feature thresholding or clustering, (2) edge
detection, and (3) region extraction. [Ref. 9] These
techniques are discussed in the following sections.

B.  CHARACTERISTIC  FEATURE  THRESHOLDING 

Characteristic  feature  or  gray-level  thresholding  is  a 
widely  used  segmentation  technique.  The  general  idea  is  to 
divide  the  gray  scale  of  a  histogram  into  bands  of  a  similar 
characteristic,  e.g.,  gray  level.  In  general,  thresholding 
can  be  described  mathematically  as 

$$ S(x,y) = k \quad \text{if} \quad T_{k-1} \le f(x,y) < T_k, \qquad k = 1, 2, \ldots, m $$

where (x,y) are the x- and y-coordinates of a pixel; f(x,y)
is the gray level at that pixel; S(x,y) is the segmented
function of (x,y); $T_0, T_1, \ldots, T_m$ are the
threshold values, with $T_0$ being the minimum and $T_m$
being the maximum; and m is the total number of distinct
bands (or labels) assigned to the segmented image. The
selection of the threshold value(s) is not a simple task and
can depend on several factors. If the threshold depends only
on f(x,y), the gray level, it is called a 'global threshold'.
If the value is dependent on f(x,y) and the average gray
level of the neighborhood around that pixel, it is called a
'local threshold'. If the threshold is based on the gray
level f(x,y), the neighborhood gray level, and the
coordinates x and y of the pixel, it is called a 'dynamic
threshold'. [Ref. 9] Selecting a threshold is thus both
difficult and very important.
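
The band-labeling rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `segment_bands`, the list-of-lists image representation, and the decision to place a pixel equal to the maximum threshold in the top band are not taken from the references.

```python
from bisect import bisect_right


def segment_bands(image, thresholds):
    """Label each pixel with the band k for which T[k-1] <= f(x,y) < T[k].

    `thresholds` is the ascending list [T_0, T_1, ..., T_m] (assumed layout).
    """
    t_min, t_max = thresholds[0], thresholds[-1]
    m = len(thresholds) - 1
    labels = []
    for row in image:
        out = []
        for gray in row:
            if not t_min <= gray <= t_max:
                raise ValueError("gray level outside [T_0, T_m]")
            # bisect_right counts the thresholds <= gray, which is exactly k.
            # Assumption: a pixel equal to T_m is folded into the top band m.
            out.append(min(bisect_right(thresholds, gray), m))
        labels.append(out)
    return labels
```

With thresholds [0, 100, 256], every gray level below 100 receives label 1 and every other level receives label 2, which is the two-band (object and background) case of a global threshold.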


There are several methods for selecting a global threshold.
Some are based on the gray level histogram, others on local
properties such as the gradient or Laplacian of an image,
and others apply to an image consisting of an object and
background where the percentage of the object area in the
image is known. The 'mode method' is a technique based on the
gray level histogram where the threshold is selected in the
valley between the peaks (or modes) of the histogram. This
approach has the advantage that it reduces the probability
of misclassifying an object point as a background point and
vice versa.
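
A minimal sketch of the mode method follows, assuming integer gray levels and a histogram that is clean enough for a simple local-maximum test to find its peaks. The function names and the peak test are illustrative assumptions; a noisy histogram would normally need smoothing first.

```python
def gray_histogram(image, levels=256):
    """Count the number of pixels at each gray level 0..levels-1."""
    hist = [0] * levels
    for row in image:
        for gray in row:
            hist[gray] += 1
    return hist


def mode_threshold(hist):
    """Return the gray level at the valley between the two tallest peaks."""
    last = len(hist) - 1
    peaks = []
    for g, count in enumerate(hist):
        left = hist[g - 1] if g > 0 else -1
        right = hist[g + 1] if g < last else -1
        # Assumed peak test: strictly above the left neighbor and not
        # below the right neighbor.
        if count > left and count >= right:
            peaks.append(g)
    if len(peaks) < 2:
        raise ValueError("histogram is not bimodal under this simple peak test")
    # Keep the two tallest peaks and order them by gray level.
    lo, hi = sorted(sorted(peaks, key=lambda g: hist[g], reverse=True)[:2])
    # The threshold is the first minimum of the histogram between the peaks.
    return min(range(lo, hi + 1), key=lambda g: hist[g])
```

When the valley is flat, this sketch simply returns the first of the equal minima, which is one arbitrary choice among several possible thresholds.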

However, there are some disadvantages to this technique.
Spatial information is not used to arrive at the thresholds,
which means there is no assurance that the segmented regions
are contiguous. The minimum of the valley may be
difficult to locate since the valley may be broad and flat.
Methods have been proposed to sharpen the peaks to more
clearly define a valley bottom. A. Rosenfeld [Ref. 10]
proposed an iterative method, called relaxation, to sharpen
the peaks in enhancing images and their histograms. [Ref. 9]

A simple example of a bimodal (two peaks) histogram is
shown in Figure 2.2. The objective is to select T such that
band B1 contains, as closely as possible, levels associated
with the background, while B2 contains levels associated
with the object(s). Each band is assigned a single gray
level within that band which will best discriminate the
object from the background. This figure also demonstrates
the case of a broad and flat valley, where many of the
pixels in band B2 are not part of the object but may be
noise, and therefore part of the background. The iterative
method mentioned above is one possible way to enhance the
peak at the right, creating a truer representation of the
object. Using the original threshold value, errors will be
introduced into the scene analysis process, which is
unacceptable as was stated earlier. This thesis looks at
the use of the iterative method in selecting a threshold and
creating a segmented image.


Figure  2.2:  Histogram  thresholding  [Ref.  6] 


C.  EDGE  DETECTION 


Edge  detection  is  an  image  segmentation  technique  based 
on  the  discontinuity  of  gray  levels  at  the  boundary  between 
different  objects.  This  discontinuity  can  be  any  one  of 
several  geometrical  forms: 

(1) An edge - The gray level is relatively constant in
each of two adjacent regions, and changes abruptly at
the border between the regions.

(2)  A  line  or  curve  -  The  gray  level  of  a  thin  strip  in 
the  image  differs  from  the  two  regions  on  either  side 
of  the  strip. 

(3) A spot - The gray level is relatively constant except
at one location in the image. In a cross-sectional view
this looks like the spike of a line (Figure 2.3), but it
appears as a spike from all directions. [Ref. 5]

Figure 2.3: a) Idealized edge cross section,
b) Perfect 'spike' line.

Edge detection schemes consist of three steps:

(1)  The  use  of  a  gradient  or  derivative  operator  to 
detect  locations  where  the  gray  level  is  changing 
rapidly.  In  the  case  of  digital  images,  difference 
operators  are  used  instead  of  derivatives. 

(2)  A  threshold  operation  is  performed  on  the  gradient  in 
order  to  decide  if  an  edge  has  been  found.  The  edge 
points  are  assigned  a  value  greater  than  the 
background  if  the  gradient  is  larger  than  a  certain 
threshold.  This  threshold  selection  is  a  key  problem 
in  noisy  images.  Too  high  a  threshold  does  not 
permit  the  detection  of  subtle,  low-intensity  edges. 
A  value  too  low  causes  noise  to  be  detected  as  edges. 

(3)  Pixels  which  have  been  determined  to  be  edges  must 
then  be  linked  to  form  closed  curves  surrounding  the 
regions.  [Ref.  11] 
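
The first two steps can be sketched as follows. This is an illustrative example only: it assumes a forward-difference operator and the sum of absolute differences as the gradient magnitude, neither of which is prescribed by the references, and it omits the edge-linking step (3) entirely.

```python
def gradient_magnitude(image):
    """Approximate the gradient with forward differences.

    Uses |f(x+1,y) - f(x,y)| + |f(x,y+1) - f(x,y)|, with differences that
    would reach past the image border taken as zero (an assumption).
    """
    rows, cols = len(image), len(image[0])
    grad = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            dx = image[y][x + 1] - image[y][x] if x + 1 < cols else 0
            dy = image[y + 1][x] - image[y][x] if y + 1 < rows else 0
            grad[y][x] = abs(dx) + abs(dy)
    return grad


def edge_map(image, threshold):
    """Mark a pixel as an edge point (1) if its gradient exceeds the threshold."""
    return [[1 if g > threshold else 0 for g in row]
            for row in gradient_magnitude(image)]
```

Raising `threshold` suppresses weak responses caused by noise but also removes low-intensity edges, which is the trade-off described in step (2).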

Edge  detection  is  of  limited  value  as  an  approach  to 
segmentation  of  noisy  remotely  sensed  images.  Often  the 
edges have gaps at places where the transitions between
regions are not sufficiently abrupt. Additional edges may
be  detected  at  points  that  are  not  part  of  region 
boundaries,  and  the  detected  edges  will  not  form  a  set  of 
closed,  connected  object  boundaries.  [Ref.  1] 


D.  REGION  EXTRACTION 


Another  way  of  doing  segmentation  is  to  divide  the  image 
into  regions.  Region  extraction  techniques  can  be  divided 
into  three  categories:  (1)  region  merging,  (2)  region 
splitting,  and  (3)  combination  of  region  merging  and 
splitting. 

Since  the  goal  of  segmentation  is  to  partition  an  image 
into  regions,  a  direct  approach  is  to  attempt  a  partitioning 
of  the  image  into  regions  which  satisfy  a  similarity 
criterion,  i.e.,  group  points  into  regions.  The  criteria 
which  can  be  used  in  extracting  objects  include  region 
homogeneity  (in  gray  level,  texture,  etc.)  and  contrast  with 
the  background,  strength  of  the  region's  edges,  size,  shape 
simplicity,  and  conformity  to  a  desired  texture  or  shape. 
The advantage of this approach is that it yields not only
the boundary points of regions but also the assurance that a
similarity criterion is satisfied by all points within the regions. In
order  to  group  points,  three  fundamental  issues  must  be 
resolved.  The  first  is  to  determine  the  number  of  regions. 
The  second  is  to  determine  some  properties  or  features  which 
distinguish  one  region  from  the  other  regions.  The  third  is 
to  specify  a  suitable  similarity  criterion  which  will 
produce  a  'meaningful'  segmentation.  A  'meaningful' 
segmentation  is  a  subjective  term  and  is  based  on  subjective 
methods.  [Ref.  6] 


One  method  is  called  region  growing.  This  approach 
starts  with  very  small  regions  with  uniform  pixel 
properties.  Growth  begins  by  starting  with  one  of  these 
regions  and  merging  neighboring  regions  with  it,  one  at  a 
time.  The  choice  of  which  neighbor  to  merge  will  depend  on 
both  the  similarity  of  the  regions  (based  on  gray  level, 
texture, etc.) and on the size and shape of the resultant
merged  region.  Because  of  the  sequential  operations 
involved,  the  process  is  slow. 
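
A simplified sketch of region growing is given below. It treats single pixels as the initial small regions, grows one region from a seed pixel, and uses closeness to the running region mean as the similarity test. The seed, the 4-neighbor connectivity, the `tolerance` parameter, and the function name are assumptions made for illustration; the size and shape criteria mentioned above are not modeled.

```python
from collections import deque


def grow_region(image, seed, tolerance):
    """Grow a region from `seed` = (row, col) by merging similar 4-neighbors.

    A neighbor is merged when its gray level is within `tolerance` of the
    current mean gray level of the region (an assumed similarity test).
    """
    rows, cols = len(image), len(image[0])
    region = {seed}
    total = image[seed[0]][seed[1]]
    frontier = deque([seed])
    while frontier:
        y, x = frontier.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < rows and 0 <= nx < cols and (ny, nx) not in region:
                mean = total / len(region)
                if abs(image[ny][nx] - mean) <= tolerance:
                    # Merge one neighbor at a time and update the region mean.
                    region.add((ny, nx))
                    total += image[ny][nx]
                    frontier.append((ny, nx))
    return region
```

Because the mean changes after every merge, the result can depend on the order in which neighbors are visited, which illustrates the sequential character of the method.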

Another approach is region splitting. This approach
considers the whole image as a single region and partitions
it by repeated splitting. Two simple approaches to
subdividing an image are bisection and triangulation. In
bisection, if the complete image is not homogeneous, it is
divided into quadrants; if a quadrant is not homogeneous, it
is divided again into quadrants; this process continues
until all of the quadrants are homogeneous. In
triangulation, the image is divided into four triangular
sectors which meet at a point having a gray level farthest
from the mean; if a triangle is not homogeneous, it is
divided into four triangles; this continues in a manner
similar to the bisection method. There are two serious
problems with this technique. The image could be subdivided
down to the single pixel level, which is probably
unacceptable, or the final partition may contain adjacent
regions with identical characteristics.
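
The bisection procedure can be sketched recursively as shown below. The homogeneity test used here (gray-level range no larger than `max_range`) and the function name are assumptions for illustration; any other homogeneity measure could be substituted.

```python
def split_region(image, top, left, height, width, max_range):
    """Split a block into quadrants until every block is homogeneous.

    Returns a list of (top, left, height, width) blocks. A block is taken
    to be homogeneous when max - min gray level <= max_range (assumed test).
    """
    values = [image[y][x]
              for y in range(top, top + height)
              for x in range(left, left + width)]
    single_pixel = height == 1 and width == 1
    if single_pixel or max(values) - min(values) <= max_range:
        return [(top, left, height, width)]
    # Halve each dimension that is longer than one pixel.
    half_h, half_w = height // 2, width // 2
    row_cuts = ([(top, height)] if height == 1 else
                [(top, half_h), (top + half_h, height - half_h)])
    col_cuts = ([(left, width)] if width == 1 else
                [(left, half_w), (left + half_w, width - half_w)])
    blocks = []
    for r0, h in row_cuts:
        for c0, w in col_cuts:
            blocks.extend(split_region(image, r0, c0, h, w, max_range))
    return blocks
```

For a 4 x 4 image that is uniform except for its upper right quadrant, the sketch returns four quadrants, three of which are adjacent and identical in gray level; this is the second problem noted above, and it is what a subsequent merging pass would correct.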

A method which is preferable to either merging or
splitting is the combination of the two, or the
merge-and-split method. The general idea is to start with a
given initial partition: for example, the entire image as a
single region, or each pixel or small block of pixels as a
region. Adjacent regions are merged if the new region is
sufficiently homogeneous, and a region is split if it does
not meet a homogeneity criterion. [Ref. 5]

One  of  the  disadvantages  of  region  merging  processes  is 
their  inherently  sequential  nature.  The  regions  produced 
depend  greatly  on  the  order  in  which  regions  are  merged 
together. Most, if not all, region extraction methods rely
heavily  on  local  information.  It  is  difficult  to 
incorporate  global  information  into  an  algorithm  unless  the 
category  of  pictures  to  be  processed  is  severely  limited. 
All  region  extraction  techniques  process  pictures  in  an 
iterative  manner  which  usually  involves  a  large  expenditure 
of  computational  time  and  memory. 

A method which takes advantage of both parallel and
sequential methods is called relaxation. 'Parallel' methods
have the classification decision done at each point
independently of the decisions at other points.
'Sequential' methods are those which base their decision on
previous decisions. 'Sequential' methods are more powerful
than 'parallel' methods because they learn to better define
the region classification as they proceed. However,
'sequential' methods are slower and their results are still
dependent on the order in which the points are processed.
[Ref. 9]

Relaxation  is  an  iterative  approach  which  makes 
probabilistic  classification  decisions  at  every  pixel  in 
parallel  at  each  iteration.  It  then  adjusts  these  decisions 
at  successive  iterations  based  on  the  decisions  made  at  the 
preceding iteration at the neighboring points. The
relaxation method is well suited to the segmentation problem
in noisy infrared images. Noise within or near the target
will be filtered out by the sequential process in which the
probability classification of a noise pixel is adjusted
based on its neighbors. The adjustment of the
pixels to a high probability ('light') or a low probability
('dark') will enhance the peaks in the histogram, allowing
for an easy selection of a threshold. The theory for this
method will be discussed more fully in the next chapter.
[Ref. 5] In order to evaluate the usefulness of this
method, experiments are conducted and the results presented.
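
The general idea of relaxation described above can be illustrated with a generic two-label update. This sketch is not the gradient relaxation algorithm developed in Chapter III: the update rule is a common nonlinear probabilistic relaxation form, and the neighbor support term, the `strength` parameter, and the function name are assumptions chosen for illustration.

```python
def relax(prob, iterations=10, strength=0.5):
    """Iteratively adjust each pixel's probability of being 'light'.

    `prob` holds values in [0, 1]. All pixels are updated in parallel from
    the previous iteration. `strength` (0 < strength < 1) is a hypothetical
    weight that keeps the update well defined.
    """
    rows, cols = len(prob), len(prob[0])
    for _ in range(iterations):
        new = [row[:] for row in prob]
        for y in range(rows):
            for x in range(cols):
                neigh = [prob[ny][nx]
                         for ny in range(max(y - 1, 0), min(y + 2, rows))
                         for nx in range(max(x - 1, 0), min(x + 2, cols))
                         if (ny, nx) != (y, x)]
                if not neigh:
                    continue  # a one-pixel image has no neighbors
                # Support for 'light' in [-strength, strength] from the
                # mean neighbor probability (assumed compatibility rule).
                q = strength * (2 * sum(neigh) / len(neigh) - 1)
                p = prob[y][x]
                new[y][x] = p * (1 + q) / (p * (1 + q) + (1 - p) * (1 - q))
        prob = new
    return prob
```

In this sketch an isolated low-probability pixel surrounded by high-probability neighbors is pulled toward 'light' over successive iterations, so the probabilities cluster near 0 and 1 and a threshold near 0.5 separates them.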


III.  SEGMENTATION  BY  THE  GRADIENT  RELAXATION  METHOD 


Segmentation  of  an  image  into  regions  can  be  done  by 
various  methods  described  in  the  previous  chapter.  These 
techniques  fall  into  several  categories:  region  merging, 
region  splitting,  and  a  combination  of  merging  and  splitting 
as  mentioned  before.  A  method  which  provides  for  an  easy 
selection  of  a  threshold  value  and  combines  the  advantages 
of  sequential  and  parallel  processing  techniques  is  the 
relaxation  technique.  This  chapter  will  discuss  the  theory 
behind  the  relaxation  technique  and  develop  the  mathematical 
relationships  used  in  the  gradient  relaxation  method,  the 
segmentation  technique  used  in  this  work. 

A.  INTRODUCTION  TO  RELAXATION  PROCESSES 

Relaxation,  or  iterative  methods,  were  originally 
developed  as  a  numerical  analysis  tool  to  solve  a  set  of 
simultaneous  equations.  In  recent  years,  relaxation  methods 
have  been  applied  to  image  analysis.  The  classification  of 
parts  in  an  image  using  relaxation  techniques  was  first 
introduced  by  A.  Rosenfeld  [Ref.  12]  and  S.  Zucker  [Ref. 
13].  These  methods  have  been  applied  to  histogram 
modification  (a  peak  enhancement  scheme),  noise  cleaning, 
edge  and  curve  detection,  curve  thinning,  angle  detection, 
template matching, and region labeling.

Image analysis usually involves the discrimination or
classification of parts within an image. Classification can
be based on gray level intensity by categorizing points as
'light' (object) or 'dark' (background), or vice versa, in
the segmented infrared images. Edge or non-edge point
classification is based on some local property (e.g.,
the magnitude of the gradient) evaluated at that point.
Angles on a curve are classified based on the magnitude of
the curvature of the curve at that point. Classification of
image points based on these properties is error-prone,
because noise in the image may cause the local property to
be misleading. This misclassification can be compounded if
the classification is done in a 'parallel' fashion, i.e.,
each point is classified without reference to any
classification decisions of its neighboring points. However,
if the classification procedure has sequential operations,
the process takes advantage of previous classifications of
the neighbor points. This is the basis of the
classification of objects using relaxation methods. The
iterative approach has two advantages: (1) classification
decisions become better informed as the analysis proceeds
and (2) the method can use fuzzy or probabilistic
classifications rather than making firm decisions
immediately as would be the case in a parallel process.
[Ref. 10]

The iterative probabilistic classification method can be
described in the following manner. A set of objects
(points, lines, regions, etc.) $A_1, A_2, \ldots, A_N$ are classified
into a set of classes $\lambda_1, \lambda_2, \ldots, \lambda_m$. Each object has a
neighbor relation, i.e., each $A_i$ has a specified set of $A_j$'s
as neighbors. Each object $A_i$ is associated with a
probability vector $(P_{i1}, P_{i2}, \ldots, P_{im})$, where $P_{ih}$ is an
estimate of the probability that $A_i$ belongs to a certain
class $\lambda_h$. The initial probability is based on a
conventional type of analysis. For example, a point's
probability is based on its gray level, i.e., proportional
to the distance of that gray level to the maximum value of
the gray level range. The next step is to define a measure
of compatibility between an object $A_i$ belonging to $\lambda_h$ and
another object $A_j$ belonging to $\lambda_k$. If there is a high
compatibility (or similarity) between object $A_i$ and object
$A_j$, i.e., $A_i, A_j \in \lambda_k$, object $A_i$ is reinforced by its
neighbors. Thus its probability is increased. However, if


where $c(i,h,j,k)$ is the compatibility coefficient between
objects $A_i$ and $A_j$, with values in $[-1, 1]$ (from low
compatibility to high compatibility). [Ref. 14]

The application of relaxation techniques to segmentation
involves the classification of pixels into 'light' and
'dark' classes. The initial probability of each pixel
belonging to a certain class is based on its gray level, i.e.,
proportional to the distance of the gray level to the
maximum value of the gray level range. These probabilities
are iteratively adjusted based on the neighborhood
probabilities, with 'light' reinforcing 'light' and 'dark'
reinforcing 'dark'. This is the basic technique used in the
algorithm which will be discussed in the following section.


B.  GRADIENT  RELAXATION  ALGORITHM 
1.  Gradient  Relaxation  Basics 


The segmentation technique which is to be analyzed
is a region splitting method using a recursive procedure of
the two-class relaxation technique. The two-class technique
controls the segmentation process and provides for an
automatic selection of a threshold. Normally, in the
application of various segmentation techniques based on
thresholding, the histogram shows two or more peaks in at
least one of the spectral features corresponding to various
homogeneous regions of an image. Very often preprocessing
is done to alter the histograms and local properties are
used to compute the local, global, or dynamic threshold.
However, if the intensity histogram of the image is
unimodal, then the application of thresholding techniques
produces a poor segmentation and does not establish a
criterion for automatic threshold selection. A unimodal
distribution is typically obtained when the image consists
mostly of a large background area with other small but
significant objects (or regions) in the image. For example,
in the case of a complex aerial photograph which may have
many objects within the scene, the histogram may have only
one broad peak because the restricted range of intensities
for the objects is probably covered by the background.

2. Development of the Gradient Relaxation Algorithm

In a paper by B. Bhanu and O. Faugeras [Ref. 15],
a gradient relaxation algorithm was proposed for the
segmentation of images having a unimodal distribution.
This algorithm is based on the use of inconsistency and
uncertainty to define a global criterion upon the set of
pixels. Let $\lambda_1$ and $\lambda_2$ correspond to two classes, white
(gray level = 255) and black (gray level = 0), respectively.
'Inconsistency' is defined as the difference between the
probability vector $\mathbf{P}_i = [P_i(\lambda_1), P_i(\lambda_2)]$ and the
compatibility vector $\mathbf{Q}_i = [Q_i(\lambda_1), Q_i(\lambda_2)]$ of the ith
pixel. In other words, it is the discrepancy between what
every pixel 'thinks' about its own labeling ($\mathbf{P}_i$) and what its
neighbors 'think' about that labeling ($\mathbf{Q}_i$). 'Uncertainty'
is measured by the entropy function and is defined to be

$$H_i(P_i(\lambda_1)) = \frac{1}{\ln 2}\left[ P_i(\lambda_1)\ln\frac{1}{P_i(\lambda_1)} + P_i(\lambda_2)\ln\frac{1}{P_i(\lambda_2)} \right] \tag{3.1}$$
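The entropy of (3.1) can be evaluated directly. The following is a minimal sketch, assuming the two-class constraint $P_i(\lambda_2) = 1 - P_i(\lambda_1)$ and taking the value of $p \ln(1/p)$ at $p = 0$ to be its limit, zero. The function name `entropy` is chosen here for illustration only.

```python
import math


def entropy(p1):
    """Two-class entropy of (3.1), in bits; p1 stands for P_i(lambda_1)."""
    p2 = 1.0 - p1  # assumed two-class constraint: P_i(lambda_2) = 1 - P_i(lambda_1)
    total = 0.0
    for p in (p1, p2):
        if p > 0.0:  # p * ln(1/p) tends to 0 as p tends to 0
            total += p * math.log(1.0 / p)
    return total / math.log(2.0)
```

Under these assumptions the function is largest at `p1 = 0.5` and vanishes at `p1 = 0.0` and `p1 = 1.0`, which matches the high-uncertainty and high-certainty cases described in the text.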


A criterion is defined as

$$C(\mathbf{P}_1, \mathbf{P}_2, \ldots, \mathbf{P}_N) = \sum_{i=1}^{N} \mathbf{P}_i \cdot \mathbf{Q}_i \tag{3.2}$$

where N is the total number of pixels in the image. The
goal is to maximize this criterion. The relaxation process
is specified by choosing a model of interaction between
pixels and attaching to each pixel i the set $V_i$ of its eight
nearest neighbors. The idea is to make like pixels
reinforce like pixels by defining a compatibility function

$$\begin{aligned}
c(i, \lambda_m, j, \lambda_n) &= 0, \quad m \neq n, && \text{for pixel } j \text{ in } V_i \text{ for all } i \\
c(i, \lambda_m, j, \lambda_m) &= 1, \quad m = 1, 2, && \text{for pixel } j \text{ in } V_i \text{ for all } i
\end{aligned} \tag{3.3}$$

where i ranges from 1 to N pixels.


The compatibility vector, $\mathbf{Q}_i$, for the two class case
is then

$$Q_i(\lambda_m) = \frac{1}{8} \sum_{j \in V_i} \sum_{n=1}^{2} c(i, \lambda_m, j, \lambda_n)\, P_j(\lambda_n), \quad m = 1, 2, \quad i = 1, \ldots, N \tag{3.4}$$

Substituting for c, this becomes the mean neighborhood
probability of the ith pixel for the case being considered,

$$Q_i(\lambda_m) = \frac{1}{8} \sum_{j \in V_i} P_j(\lambda_m) \tag{3.5}$$
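The mean neighborhood probability of (3.5) can be sketched as follows for the class $\lambda_1$. The text does not say how pixels on the image border are handled, so this sketch assumes edge replication (an out-of-image neighbor is replaced by the nearest in-image pixel); the function name `neighborhood_mean` and the list-of-lists image layout are also assumptions.

```python
def neighborhood_mean(p):
    """Q_i(lambda_1) of (3.5) for a 2-D list p holding P_i(lambda_1).

    Assumption: border pixels use edge replication, so every pixel
    still averages over eight neighbor positions.
    """
    rows, cols = len(p), len(p[0])
    q = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue  # the pixel itself is not in V_i
                    rr = min(max(r + dr, 0), rows - 1)
                    cc = min(max(c + dc, 0), cols - 1)
                    total += p[rr][cc]
            q[r][c] = total / 8.0
    return q
```

Because $P_j(\lambda_2) = 1 - P_j(\lambda_1)$ in the two-class case, $Q_i(\lambda_2)$ follows as one minus the returned value and does not need a second pass.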


The choice of compatibility function in (3.3) will
provide the desired result in the interior of the region,
but along the edges of a region the pixel label may be
uncertain because of two different classes of neighbors.
This may cause distortion at the boundary.

The maximization of the criterion (3.2) means that a
local maximum is sought that is close to the initial
labeling. The maximum of the criterion is achieved by
aligning the vectors $\mathbf{P}_i$ and $\mathbf{Q}_i$ while turning them into unit
vectors. This results in increasing the consistency
(reducing the difference) and the certainty between the
vectors $\mathbf{P}_i$ and $\mathbf{Q}_i$.
It is easily seen from the definition of inconsistency that
the minimum occurs when $\mathbf{P}_i = \mathbf{Q}_i$. From Figure 3.1, the
maximum entropy, or high uncertainty, occurs when $P_i(\lambda_m) = 0.5$.
The maximum certainty occurs when $P_i(\lambda_m) = 0.0$ or $1.0$,
i.e., $\mathbf{P}_i = [0,1]$ or $[1,0]$, a unit vector.


The uncertainty definition clearly shows that the
initial assignment of probabilities is important because it
affects the rate of convergence and the final results of the
relaxation process. The initial probability of each pixel
is defined as

$$P_i(\lambda_1) = I(i)/G \tag{3.6}$$

Figure 3.1: Entropy Function, H(p) [Ref. 16]

where I(i) is the intensity of pixel i in the range $0 \le I(i) \le G$,
and G is the maximum value of the gray levels. This
definition disregards any a priori knowledge that may be
known about an image. However, a priori knowledge can be
included in the initial probabilities by estimating the
ratio of the number of white pixels, $N_w$, to the number of
black pixels, $N_b$. This ratio is

$$r = \frac{N_w}{N_b} = \frac{\sum_i P_i(\lambda_1)}{\sum_i P_i(\lambda_2)} = \frac{\bar{I}}{G - \bar{I}} \tag{3.7}$$
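The last equality in (3.7) follows from the initial probabilities of (3.6) when the two-class constraint $P_i(\lambda_2) = 1 - P_i(\lambda_1)$ is assumed, with $\bar{I}$ denoting the mean intensity and $N$ the number of pixels. A short derivation under those assumptions:

```latex
\sum_{i=1}^{N} P_i(\lambda_1) = \frac{1}{G}\sum_{i=1}^{N} I(i) = \frac{N\bar{I}}{G},
\qquad
\sum_{i=1}^{N} P_i(\lambda_2) = \sum_{i=1}^{N}\left(1 - \frac{I(i)}{G}\right) = \frac{N\,(G - \bar{I})}{G},
\qquad\text{so}\qquad
r = \frac{N\bar{I}/G}{N\,(G - \bar{I})/G} = \frac{\bar{I}}{G - \bar{I}} .
```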

where $\bar{I}$ is the mean intensity level of the image. By
knowing this, the distribution of gray levels can be
modified so as to make the ratio r closer to the true ratio,
$r_0$. A simple way to do this is to define

$$I'(i) = \mathrm{FACT}\,(I(i) - \bar{I}) + I_0 \tag{3.8}$$

where $I_0$ is a desired mean and FACT is a parameter which can
be chosen to be

$$\begin{aligned}
&\mathrm{FACT} = 1 && \text{for } I(i) > \bar{I} \\
&0.7 < \mathrm{FACT} < 1.0 && \text{for } I(i) < \bar{I}
\end{aligned}$$

Substituting $I'(i)$ from (3.8) for $I(i)$ in (3.6),

$$P_i(\lambda_1) = \mathrm{FACT}\,(I(i) - \bar{I})/G + I_0/G \tag{3.9}$$


For the analysis performed in this thesis, the following
values were used.

$$\mathrm{FACT} = 1.0, \qquad G = 255, \qquad I_0/G = 0.5$$

$$P_i(\lambda_1) = (I(i) - \bar{I})/255 + 0.5 \tag{3.10}$$

When the first term of (3.10) is greater than 0.5 or
less than -0.5, then a value of 1.0 or 0.0 will be assigned
to the probability, respectively. [Ref. 17]
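The initial assignment of (3.9) and (3.10), including the clipping rule just described, can be sketched as below. This is an illustration only: the function name, the parameter names, and the use of a single FACT value for all pixels are assumptions (the text allows FACT to differ above and below the mean).

```python
def initial_probabilities(image, fact=1.0, g=255.0, i0_over_g=0.5):
    """P_i(lambda_1) from (3.9); the defaults reproduce (3.10).

    image is a 2-D list of gray levels. Values falling outside [0, 1]
    are clipped to 0.0 or 1.0, as described for (3.10).
    """
    pixels = [v for row in image for v in row]
    mean = sum(pixels) / len(pixels)  # mean intensity of the image
    return [
        [min(1.0, max(0.0, fact * (v - mean) / g + i0_over_g)) for v in row]
        for row in image
    ]
```

A pixel at the mean intensity starts at probability 0.5, the point of highest uncertainty, while pixels far above or below the mean start closer to 1.0 or 0.0.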

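The gradient of the criterion is obtained from (3.2)

$$C = \sum_{i=1}^{N} \mathbf{P}_i \cdot \mathbf{Q}_i = P_i(\lambda_1) Q_i(\lambda_1) + P_i(\lambda_2) Q_i(\lambda_2) + \sum_{j \in V_i} \mathbf{P}_j \cdot \mathbf{Q}_j + \sum_{k \notin V_i} \mathbf{P}_k \cdot \mathbf{Q}_k$$

$$\nabla C = \left[ \frac{\partial C}{\partial P_i(\lambda_1)},\ \frac{\partial C}{\partial P_i(\lambda_2)} \right] \tag{3.11}$$

Solving for each component of the gradient, we have

$$\frac{\partial C}{\partial P_i(\lambda_1)} = Q_i(\lambda_1) + \frac{\partial}{\partial P_i(\lambda_1)} \sum_{j \in V_i} \mathbf{P}_j \cdot \mathbf{Q}_j \tag{3.12a}$$

$$\frac{\partial C}{\partial P_i(\lambda_2)} = Q_i(\lambda_2) + \frac{\partial}{\partial P_i(\lambda_2)} \sum_{j \in V_i} \mathbf{P}_j \cdot \mathbf{Q}_j \tag{3.12b}$$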


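Looking only at (3.12a) and taking the second term only, we
obtain

$$\frac{\partial}{\partial P_i(\lambda_1)} \sum_{j \in V_i} \mathbf{P}_j \cdot \mathbf{Q}_j = \sum_{j \in V_i} \frac{\partial \mathbf{P}_j}{\partial P_i(\lambda_1)} \cdot \mathbf{Q}_j + \sum_{j \in V_i} \mathbf{P}_j \cdot \frac{\partial \mathbf{Q}_j}{\partial P_i(\lambda_1)}$$

The first term is zero because the probabilities of the
neighbor pixels, $\mathbf{P}_j$'s, are independent of the probability of
pixel i, $\mathbf{P}_i$; therefore

$$\frac{\partial}{\partial P_i(\lambda_1)} \sum_{j \in V_i} \mathbf{P}_j \cdot \mathbf{Q}_j = \sum_{j \in V_i} \mathbf{P}_j \cdot \frac{\partial \mathbf{Q}_j}{\partial P_i(\lambda_1)} = \sum_{j \in V_i} \left[ P_j(\lambda_1) \frac{\partial Q_j(\lambda_1)}{\partial P_i(\lambda_1)} + P_j(\lambda_2) \frac{\partial Q_j(\lambda_2)}{\partial P_i(\lambda_1)} \right] \tag{3.13}$$

Recall that the compatibility vector (3.5) is

$$Q_j(\lambda_k) = \frac{1}{8} \sum_{m \in V_j} P_m(\lambda_k)$$

where $V_j$ is the set of neighbor points of point j, of which
point i is a member as shown in Figure 3.2. Taking the
partial derivative, we find

$$\frac{\partial Q_j(\lambda_1)}{\partial P_i(\lambda_1)} = \frac{1}{8} \frac{\partial P_i(\lambda_1)}{\partial P_i(\lambda_1)} = \frac{1}{8} \tag{3.14a}$$

and similarly,

$$\frac{\partial Q_j(\lambda_2)}{\partial P_i(\lambda_1)} = 0 \tag{3.14b}$$

Substituting (3.14) into (3.13) leads to

$$\frac{\partial}{\partial P_i(\lambda_1)} \sum_{j \in V_i} \mathbf{P}_j \cdot \mathbf{Q}_j = \frac{1}{8} \sum_{j \in V_i} P_j(\lambda_1) = Q_i(\lambda_1) \tag{3.15}$$

Substituting (3.15) into (3.12a) results in

$$\frac{\partial C}{\partial P_i(\lambda_1)} = Q_i(\lambda_1) + Q_i(\lambda_1) = 2Q_i(\lambda_1)$$

Similarly, the second component is

$$\frac{\partial C}{\partial P_i(\lambda_2)} = 2Q_i(\lambda_2)$$

In summary, the gradient of the criterion, C, is

$$\nabla C = \left[ \frac{\partial C}{\partial P_i(\lambda_1)},\ \frac{\partial C}{\partial P_i(\lambda_2)} \right] = \left[ 2Q_i(\lambda_1),\ 2Q_i(\lambda_2) \right] \tag{3.16}$$

[Ref. 18]

Figure 3.2: Set of pixels $V_i$ and $V_j$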
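The result that each gradient component equals twice the mean neighborhood probability can be checked numerically. The sketch below is an illustration under stated assumptions: neighbors wrap around the image edges so that every pixel has eight distinct neighbors and the neighbor relation is symmetric (pixel i is in $V_j$ whenever j is in $V_i$, as the derivation requires), and $P_i(\lambda_1)$ and $P_i(\lambda_2)$ are treated as independent variables, as in the derivation. All function names are hypothetical.

```python
def q_wrap(p):
    """Mean of the eight neighbors, (3.5), with wrap-around borders (assumed)."""
    rows, cols = len(p), len(p[0])
    offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
    return [
        [sum(p[(r + dr) % rows][(c + dc) % cols] for dr, dc in offsets) / 8.0
         for c in range(cols)]
        for r in range(rows)
    ]


def criterion(p1, p2):
    """C of (3.2): sum over all pixels of P_i . Q_i for the two classes."""
    q1, q2 = q_wrap(p1), q_wrap(p2)
    return sum(
        p1[r][c] * q1[r][c] + p2[r][c] * q2[r][c]
        for r in range(len(p1))
        for c in range(len(p1[0]))
    )


def numeric_gradient(p1, p2, r, c, h=1e-3):
    """Central-difference estimate of dC/dP_i(lambda_1) at pixel (r, c)."""
    plus = [row[:] for row in p1]
    minus = [row[:] for row in p1]
    plus[r][c] += h
    minus[r][c] -= h
    return (criterion(plus, p2) - criterion(minus, p2)) / (2.0 * h)
```

For any pixel `(r, c)` of a grid at least 3 by 3, `numeric_gradient(p1, p2, r, c)` should agree with `2 * q_wrap(p1)[r][c]` to within rounding error, which is the first component of (3.16).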


An efficient method called the steepest ascent
technique will be utilized to maximize the criterion. This
technique begins with an initial probability, $\mathbf{P}_i^{(0)}$,
i = 1,...,N for each pixel and iteratively adjusts the
probability vector to converge to a local maximum of
criterion (3.2). This is achieved by defining a sequence
$\mathbf{P}_i^{(\ell)}$ as:

$$\mathbf{P}_i^{(\ell+1)} = \mathbf{P}_i^{(\ell)} + \rho_i^{(\ell)}\, \mathrm{PROJ}^{(\ell)} \left[ \mathbf{G}_i^{(\ell)} \right] \tag{3.17}$$

where $\rho_i^{(\ell)}$ is a positive step size, the vector $\mathbf{G}_i^{(\ell)}$ is the
gradient of the function to be maximized, i.e.,

$$\mathbf{G}_i^{(\ell)} = \nabla C = \left[ \frac{\partial C}{\partial P_i(\lambda_1)},\ \frac{\partial C}{\partial P_i(\lambda_2)} \right]$$

for the two class case, and $\mathrm{PROJ}^{(\ell)}$ is a projection operator
that insures that $\mathbf{P}_i^{(\ell+1)}$ is still a probability vector. [Ref. 17]

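Based on this technique, the iteration of the
initial probabilities $\mathbf{P}_i^{(0)}$ is defined as

$$\begin{aligned}
P_i^{(\ell+1)}(\lambda_1) &= P_i^{(\ell)}(\lambda_1) + \rho_i^{(\ell)}\, \mathrm{PROJ}^{(\ell)} \left[ \frac{\partial C}{\partial P_i(\lambda_1)} \right] \\
P_i^{(\ell+1)}(\lambda_2) &= P_i^{(\ell)}(\lambda_2) + \rho_i^{(\ell)}\, \mathrm{PROJ}^{(\ell)} \left[ \frac{\partial C}{\partial P_i(\lambda_2)} \right]
\end{aligned} \tag{3.18}$$

where $\rho_i^{(\ell)}$ is a step size which will be developed later.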

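A method discussed by J. B. Rosen which maximizes a
function while satisfying a constraint or constraints is
called the gradient projection method [Ref. 19]. The
constraint for this case is

$$P_i^{(\ell+1)}(\lambda_1) + P_i^{(\ell+1)}(\lambda_2) = 1 \tag{3.19}$$

and

$$\begin{aligned}
Q_i(\lambda_1) + Q_i(\lambda_2) &= 1 \\
2Q_i(\lambda_1) + 2Q_i(\lambda_2) &= 2
\end{aligned} \tag{3.20}$$

but (3.20) is the summation of the components of the
gradient of criterion C, (3.11) and (3.16),

$$\frac{\partial C}{\partial P_i(\lambda_1)} + \frac{\partial C}{\partial P_i(\lambda_2)} = 2 \tag{3.21}$$

The projection of the gradient at a point onto the closed
convex region, i.e., the constraint (3.19), is defined as

$$\mathrm{PROJ}[\mathbf{G}_i] = \mathbf{G}_i - \frac{1}{L} \sum_{k=1}^{L} (G_k \mathbf{v}) \tag{3.22}$$

where PROJ is the projection operator, L is the number of
classes, G is the gradient vector, $[G_1, G_2, \ldots, G_L]$, and v is
$[1, 1, \ldots, 1]$. [Ref. 14] This is shown graphically in Figure
3.3. For the two class case,

$$\mathbf{G}_i = \left[ \frac{\partial C}{\partial P_i(\lambda_1)},\ \frac{\partial C}{\partial P_i(\lambda_2)} \right], \qquad \mathbf{v} = [1, 1]$$

and the projection of the gradient (3.22) is a vector with
two components. Substituting (3.16) into (3.22), we find

$$\begin{aligned}
\mathrm{PROJ}^{(\ell)} \left[ \frac{\partial C}{\partial P_i(\lambda_1)} \right] &= 2Q_i(\lambda_1) - 0.5 \left[ \frac{\partial C}{\partial P_i(\lambda_1)} + \frac{\partial C}{\partial P_i(\lambda_2)} \right] \\
\mathrm{PROJ}^{(\ell)} \left[ \frac{\partial C}{\partial P_i(\lambda_2)} \right] &= 2Q_i(\lambda_2) - 0.5 \left[ \frac{\partial C}{\partial P_i(\lambda_1)} + \frac{\partial C}{\partial P_i(\lambda_2)} \right]
\end{aligned} \tag{3.23}$$

By (3.21), the bracketed sum equals 2, so the two projected
components reduce to $2Q_i(\lambda_1) - 1$ and $2Q_i(\lambda_2) - 1$.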
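The projection of (3.22) amounts to subtracting the mean of the gradient components from each component, so that the projected components sum to zero and a step along them leaves the sum of the probabilities unchanged. A minimal sketch for L classes follows; the function name is chosen here for illustration.

```python
def project_gradient(g):
    """PROJ of (3.22): subtract (1/L) * sum(G_k) from every component of g."""
    mean = sum(g) / len(g)
    return [gk - mean for gk in g]
```

For the two class case with gradient `[2 * q1, 2 * (1 - q1)]`, the result is `[2 * q1 - 1, 1 - 2 * q1]`, in agreement with (3.21) and (3.23).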


During each iteration, the step size $\rho_i^{(\ell)}$ is normally kept
constant and is the largest possible value such that after
each iteration, the probabilities, $\mathbf{P}_i$'s, remain within the
constraint of $P_i(\lambda_1) + P_i(\lambda_2) = 1$, $P_i(\lambda_k) \ge 0.0$, k = 1,2 for
all i.

$$\rho_{i\max}^{(\ell)} = \begin{cases}
\dfrac{1 - P_i^{(\ell)}(\lambda_1)}{2Q_i(\lambda_1) - 1}, & \text{if } Q_i(\lambda_1) > 0.5 \\[2ex]
\dfrac{P_i^{(\ell)}(\lambda_1)}{1 - 2Q_i(\lambda_1)}, & \text{if } Q_i(\lambda_1) < 0.5
\end{cases} \tag{3.25}$$


In the algorithm which was used in this thesis, both the
rate of convergence to the criterion and the number of
pixels assigned to each class were controlled by setting the
step size to the following values.

$$\rho_i^{(\ell)} = \begin{cases}
\alpha_1\, \rho_{i\max}^{(\ell)}, & \text{if } Q_i(\lambda_1) > 0.5 \\
\alpha_2\, \rho_{i\max}^{(\ell)}, & \text{if } Q_i(\lambda_1) < 0.5
\end{cases} \tag{3.26}$$

where $\alpha_1$ and $\alpha_2$ are constants whose values are less than
one. The values of $\alpha_1$ and $\alpha_2$ are weighting factors which
will bias an image to a class, $\lambda_1$ or $\lambda_2$, and will influence
the convergence rate of the criterion.
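Putting (3.5), (3.18), (3.23), (3.25) and (3.26) together, one pass of the relaxation over an image can be sketched as follows. This is a sketch under stated assumptions, not the routine used in this thesis: border pixels use edge replication, the second case of (3.26) is taken as $Q_i(\lambda_1) < 0.5$, a pixel with $Q_i(\lambda_1)$ exactly 0.5 is left unchanged, and only $P_i(\lambda_1)$ is stored because $P_i(\lambda_2) = 1 - P_i(\lambda_1)$. The names `neighborhood_mean` and `relax_step` are hypothetical.

```python
def neighborhood_mean(p):
    """Mean of the eight neighbors, (3.5); borders use edge replication (assumed)."""
    rows, cols = len(p), len(p[0])
    offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
    return [
        [sum(p[min(max(r + dr, 0), rows - 1)][min(max(c + dc, 0), cols - 1)]
             for dr, dc in offsets) / 8.0
         for c in range(cols)]
        for r in range(rows)
    ]


def relax_step(p, alpha1, alpha2):
    """One iteration of (3.18) applied to P_i(lambda_1) at every pixel."""
    q = neighborhood_mean(p)
    out = []
    for p_row, q_row in zip(p, q):
        new_row = []
        for p1, q1 in zip(p_row, q_row):
            direction = 2.0 * q1 - 1.0  # projected gradient, lambda_1 component
            if q1 > 0.5:
                rho = alpha1 * (1.0 - p1) / (2.0 * q1 - 1.0)  # (3.25) scaled by (3.26)
            elif q1 < 0.5:
                rho = alpha2 * p1 / (1.0 - 2.0 * q1)
            else:
                rho = 0.0  # zero projected gradient: leave the pixel unchanged
            new_row.append(p1 + rho * direction)
        out.append(new_row)
    return out
```

Under these assumptions the update reduces to moving a pixel a fraction `alpha1` of the way toward 1.0 when its neighborhood is mostly 'light', and a fraction `alpha2` of the way toward 0.0 when its neighborhood is mostly 'dark'. An isolated dark pixel inside a light region is therefore pulled toward the light class over successive iterations.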

Figures 3.4(a to c) show the change in the criterion
as the number of iterations increases for a cell image which
was studied in the noted reference. Each figure represents
the three cases, $\alpha_1 = \alpha_2$, $\alpha_1 < \alpha_2$, and $\alpha_1 > \alpha_2$, with the
parameter FACT = 1.0 in all cases. These figures show that
by increasing the weighting factors, the rate of convergence
will increase and these factors, $\alpha_1$ and $\alpha_2$, will also
control where the criterion will converge. Thus, the
relaxation process can be controlled. The
an image at each iteration. Smoothing is defined as the
elimination of a small region or regions of one class within
a much larger region of the opposite class. As the
magnitude of each factor, $\alpha_1$ and $\alpha_2$, increases, the smoothing
effect decreases. This will be demonstrated in the next
chapter. Also, the ratio of $\alpha_1$ and $\alpha_2$ controls the bias of
a class. Earlier, the parameter FACT was set equal to one.
The reason for this is shown in Figure 3.5. The effect of
this parameter on the value of the criterion and the
convergence rate of the criterion is seen to be minimal.
[Refs. 15, 20]

A  major  capability  of  this  process  is  to 
automatically  select  a  threshold  value.  This  is  a  key  task 
in  region  segmentation.  It  is  important  in  image  processing 
to  select  an  adequate  threshold  for  extracting  objects  from 
their  background.  In  the  ideal  case,  the  histogram  will 
have  a  deep  and  sharp  valley  between  two  peaks  representing 
the  object  and  the  background.  In  a  real  picture,  however, 
it  is  sometimes  difficult  to  detect  the  valley  bottom, 
especially  when  the  valley  is  flat  and  broad,  imbued  with 
noise,  or  when  the  peaks  have  extremely  unequal  heights 
producing no discernible valley. [Ref. 21] In the case
where the histogram has a flat and broad valley, a threshold
selected too low creates an object (target) which may be
larger than it actually is, or if the threshold is selected
too large, most of the actual target may be segmented into
the background.

In  the  next  chapter,  it  will  be  shown  that  as  the 
number  of  iterations  increases,  the  peaks  in  the  histogram 
will  move  farther  apart,  and  the  average  brightness  will 
increase.  When  the  peaks  are  far  apart,  the  mean  value  of 
the  original  image  or  the  segmented  image  can  be  used  as  the 
threshold  value.  [Ref.  15]  This  is  why  the  gradient 
relaxation  is  advantageous  as  compared  to  other  methods  in 
segmenting  infrared  images. 
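Once the histogram peaks are far apart, thresholding at the mean is straightforward. The sketch below assumes the image is a 2-D list of gray levels and labels a pixel as target when its level is strictly greater than the threshold; the function name and the optional `reference` argument (for using the mean of the original image instead of the segmented one) are illustrative assumptions.

```python
def threshold_at_mean(image, reference=None):
    """Return (threshold, mask), using the mean gray level as the threshold.

    The mean is taken over `reference` when given (e.g. the original
    image), otherwise over `image` itself. mask holds 1 for pixels
    above the threshold and 0 elsewhere.
    """
    source = image if reference is None else reference
    pixels = [v for row in source for v in row]
    thd = sum(pixels) / len(pixels)
    mask = [[1 if v > thd else 0 for v in row] for row in image]
    return thd, mask
```

When the two peaks sit near the ends of the gray level range, any threshold between them gives the same mask, which is why the mean becomes an adequate choice.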


Figure 3.5: Variation of the criterion, C, with the
iteration number for 3 values of FACT [Ref. 14]


IV.  ANALYSIS  OF  IMAGES 



A.  IMAGES  UNDER  ANALYSIS 

The  gradient  relaxation  segmentation  method  discussed  in 
the  previous  chapter  was  demonstrated  on  several  infrared 
images.  Ten  images  were  used  to  evaluate  the  performance  of 
this  segmentation  technique.  The  first  image  is  a  still 
photo  of  a  ship  with  low  contrast  (poor  visibility),  see 
Figure  4.1(a).  The  other  nine  images  were  obtained  from  an 
uncooled  focal  plane  infrared  sensor  (Figures  4.1(b)  -  (j)). 
The  sensor  was  placed  on  a  platform  on  which  the  sensor  was 
rotated  to  simulate  the  situation  of  a  rotating  missile. 
This  is  why  the  targets  are  seen  at  different  viewing 
angles. The images were recorded on video disc. Using the
EYECOM digitizer, individual frames were extracted from the
video disc. The video disc contained approximately 20
minutes of video data of several ships in various contrasts.
The  scenes  contained  a  wide  variation  of  noise  within  the 
images.  Instead  of  attempting  to  analyze  all  of  the  frames 
(approximately  64,000  frames),  it  was  decided  to  select 
images  which  were  representative  of  most  of  the  frames  and 
situations  depicted  on  the  video  disc.  The  purpose  is  to 
determine  how  effective  the  relaxation  segmentation  method 
is  for  these  images. 


Figure  4.1(a):  A  ship  with  low  contrast. 

Three criteria were used in the selection of the images:

1. Find images where the target stands out from the
background and is not degraded significantly by noise.
The images which met this criterion are Figures 4.1(a),
(c), and (d).

2. Find a target near or within part or all of the
background noise with an intensity level near the
intensity level of the target. This is seen in
Figures 4.1(b), (e), (f), and (g).

3. Collect a series of frames as the target is rotating,
showing how the noise changes from frame to frame.
The series selected includes targets near noise of
similar intensity (see Figures 4.1(g), (i), and (j)).
It also includes a target which because of the noise
is fragmented into several objects, to the point where
the target itself appears to be background noise (see
Figure 4.1(h)). The intention is to see if the target
can be segmented from the noise background well enough
to be able to detect it as a target.

These  cases  obviously  do  not  account  for  all  situations,  but 
are  representative  of  the  noisy  infrared  images  which  were 
available  for  this  study. 

The targeted object was then extracted from the original
512 by 512 image to form a smaller 64 by 256 image which
requires much less time to process. Figures 4.2(a) - (j)
depict each of these images with their associated
histograms.

Noise in the images comes from various sources, either
natural or the sensor. Noise sources include glare off the
surface of the water and atmospheric interference, such as
scattering and attenuation by cloud and haze. Thermal
noise is introduced since the sensor is uncooled.
Transmission noise was introduced when the image was
recorded onto the video disc and when it was digitized using
the EYECOM digitizing system.

The  COMTAL  VISION  ONE/20  Image  Processing  System  was 
used  to  display  the  images  and  to  produce  the  associated 
histogram.  COMTAL  VISION  ONE/20  is  a  complete  image 
processing  system  with  built-in  interactive  processing  and 
control  capabilities.  The  system  produces  high  spatial 
resolution  video  images  over  a  range  of  256  gray  levels. 


Figure 4.2: Original 64 x 256 images extracted from the Figure 4.1
images, with their gray-level histograms. Panels: (a) Ship with low
contrast; (i) Ship E from Figure 4.1(i); (j) Ship F from Figure 4.1(j).
Horizontal axis of each histogram: gray level.

The distribution of the pixels over the gray level range is
computed by the COMTAL processor in the following manner.
The processor counts all occurrences of each gray level in
the image. This count (the total number of pixels at each
gray level) is divided by the highest count and then
multiplied by 256. One is then subtracted from this number
to yield the distribution of that gray level in the figures.
The highest normalized count is always 255. [Ref. 22]
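The normalization just described can be sketched in a few lines. This is a literal reading of the description, not the COMTAL implementation: no rounding or clamping is applied because none is mentioned, so a gray level that never occurs comes out as -1 here. The function name and the integer gray level image layout are assumptions.

```python
def normalized_histogram(image, levels=256):
    """Histogram scaled as described: count / highest count * 256 - 1.

    image is a 2-D list of integer gray levels in 0..levels-1.
    Assumption: the scaling is applied exactly as stated, without
    rounding, so unused gray levels yield -1.
    """
    counts = [0] * levels
    for row in image:
        for v in row:
            counts[v] += 1
    peak = max(counts)  # the highest count maps to 256 - 1 = 255
    return [count * 256 / peak - 1 for count in counts]
```

The most frequent gray level always maps to 255, as stated in the text, regardless of the image size.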

The  points  in  the  original  histograms  were  not 
connected.  To  provide  a  better  feeling  for  the  shape  of  the 
histogram,  it  was  decided  to  connect  those  points  which 
presented  a  general  outline  of  the  gray  level  distribution. 
The  points  selected  are  generally  the  highest  point  in  a 
selected  neighboring  group  of  points. 

The histograms of each of these images generally show
the distribution between the background and the target. In
Figures 4.2(a) and (e) - (j), it is possible to see a
separation between the peak background level and the peak
target level. However, in each of these cases it would be
difficult to select a threshold value which could be used to
perform an effective segmentation as discussed in Chapter
II. By using the gradient relaxation technique, the problem
of determining a critical threshold value is made easy.

The selection of the weighting factors, Alpha1 and
Alpha2, and the number of iterations necessary to perform
the segmentation is very important, as mentioned in the
previous chapter. The selection of these parameters is
influenced by the detected size of the segmented target, the
needed accuracy of the object outline, and separation of the
gray level peaks. It also determines how quickly the
histogram of the segmented image reaches its widest
separation of the gray peak levels. This was also
demonstrated in the last chapter. The following parameters
were used in performing the experiments on the segmented
images:

Alpha1: The weighting factor on pixels with gray
levels greater than the mean.

Alpha2: The weighting factor on pixels with gray
levels less than the mean.

Iters: The number of iterations of the
relaxation routine.

Threshold (THD): The threshold value is used to determine
which pixels will be part of the labeled
region. Two values were selected in
each image. The first value of 220 was
chosen because it is assumed that the
higher intensities are part of the
target. The second value chosen is the
mean gray level intensity of the
original image.

Region: The total number of labeled regions. A
labeled region is a grouping of pixels
with intensities greater than the
threshold, THD.

Area: The number of pixels in the largest
labeled region.

Perimeter: The number of pixels along the boundary
of the largest labeled region.

Shape: This is a measure of the relationship
between the area and the perimeter of
the largest labeled region. It is equal
to

Shape = 2*Area/Perimeter

The shape is small for narrow objects.
The shape is large for rounded objects.
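The Region, Area, Perimeter and Shape measures defined above can be sketched as follows. Several details are not specified in the text and are assumed here: pixels are grouped with 8-connectivity, a boundary pixel is a region pixel with at least one of its four edge-adjacent neighbors outside the region (or outside the image), and the function name and returned dictionary keys are hypothetical.

```python
from collections import deque


def region_stats(image, thd):
    """Count labeled regions and measure the largest one.

    Assumptions: 8-connected grouping of pixels above thd; the
    perimeter counts region pixels touching a non-region pixel or the
    image edge through a 4-neighbor.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] > thd and not seen[r][c]:
                seen[r][c] = True
                queue, pixels = deque([(r, c)]), []
                while queue:  # breadth-first flood fill of one region
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and not seen[ny][nx] and image[ny][nx] > thd):
                                seen[ny][nx] = True
                                queue.append((ny, nx))
                regions.append(pixels)
    if not regions:
        return {"region": 0, "area": 0, "perimeter": 0, "shape": 0.0}
    largest = max(regions, key=len)
    inside = set(largest)
    perimeter = sum(
        1 for (y, x) in largest
        if any((y + dy, x + dx) not in inside
               for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)))
    )
    area = len(largest)
    return {"region": len(regions), "area": area,
            "perimeter": perimeter, "shape": 2.0 * area / perimeter}
```

With these definitions a one-pixel-wide line has Shape = 2, while a filled square of side n has Shape close to n/2, which matches the remark that narrow objects give small values and rounded objects give large ones.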

B.  APPLICATIONS  OF  THE  GRADIENT  RELAXATION  ROUTINE 

The relaxation routine was applied to each of the images
shown in Figure 4.2, and the results are separated into ten
cases. The criteria used in the analysis are as follows:

1. Are the regions uniform and homogeneous with respect
to a gray level?

2. Do the regions contain gaps (holes), and if so, can
successive iterations smooth the segmented region?

3.  Are  the  peaks  in  the  histograms  more  distinct? 

4.  Does  the  target  conform  to  a  desired  shape? 

5.  Is  a  target  detected? 

6.  Can  the  detected  object  be  used  in  the  classification 
process? 

The general format of the experiment entailed applying
different values of Alpha1 and Alpha2 to the
images for several iterations and observing the effect on
the original images. The values were subjectively chosen to
test for the cases when Alpha1 = Alpha2, Alpha1 < Alpha2,
and Alpha1 > Alpha2. The maximum number of iterations
selected was based on the theoretical results shown in
Figure 3.4 (Chapter III). These figures consistently showed
that the criterion was saturated after eight or more
iterations. Using more iterations would not have
significantly improved the segmentation.

Each case includes a discussion on the effect of the
algorithm on that image. A figure of the segmented image
and the corresponding histogram are shown. Also included
is a table summarizing the change in the area, perimeter,
and the shape of the segmented region(s) for the different
settings of the weighting factors, threshold, and number of
iterations.

1. Ship in Low Contrast (Figure 4.2(a))

The number of iterations is important in determining
the peak gray level separation of the background and the
target. Figures 4.3(a)-(d) show how each iteration
increases this separation. In the original image, the
separation is approximately 45 levels; after one iteration it
is almost 135 levels, after four iterations it is almost
225, and after eight iterations, the separation is
approximately 250 levels.

Four  cases  involving  different  Alphal,  Alpha2 
parameters  and  number  of  iterations  were  applied  to  the 
image  of  Figure  4.2(a).  Results  of  this  application  are 
seen  in  Figure  4.3  and  Table  4.1.  These  parameters 
determine  the  form  and  gray  level  intensity  of  the  segmented 
scene.  Setting  the  value  of  Alphal  2  Alpha2  increases  the 
apparent  size  of  the  target.  This  is  seen  in  Figures 


(a)  Alphal 


gray  level 


(Figure  4.3  continued) 


Lon 


(i)  Alphal  »  .1 
Alpha2  =  .4 
Iter  *  2 


gray  level 


(j)  Alphal 
Alpha2 
Iter 


gray  level 


(Figure  4.3  continued) 


TABLE 4.1

QUANTITATIVE RESULTS OF SHIP WITH LOW CONTRAST

ALPHA1  ALPHA2  ITER  THD  REGION   AREA  PERIM  SHAPE
0.3     0.3        1  220       1    808    134  12.06
0.3     0.3        1   82       1  15067    607  49.65
0.3     0.3        2  220       1    904    131  13.80
0.3     0.3        2   82       1    913    131  13.94
0.3     0.3        4  220       1    900    131  13.74
0.3     0.3        4   82       1    905    131  13.82
0.3     0.3        8  220       1    896    131  13.68
0.3     0.3        8   82       1    899    131  13.73
0.6     0.2        2  220       1    912    131  13.92
0.6     0.2        2   82       4   1438    219  13.13
0.6     0.2        8  220       1   1222    146  16.74
0.6     0.2        8   82       1   1222    146  16.74
0.2     0.6        2  220       1    720    121  11.90
0.2     0.6        2   82       1    737    127  11.61
0.2     0.6        8  220       1    541    109   9.93
0.2     0.6        8   82       1    547    109  10.04
0.1     0.4        2  220       1    657    120  10.95
0.1     0.4        2   82       1    912    131  13.92
0.1     0.4        8  220       1    467     96   9.73
0.1     0.4        8   82       1    506    102   9.92

MEAN = 82

4.3(a)-(f). The resultant image looks more like a tank, not like a
ship. By increasing the number of iterations, the region grows
larger as defined by the area. However, in the
cases (Figures 4.3(e)-(j)) where Alpha1 < Alpha2, the
segmented region appears to be closer to the true size in
the original image, and the region gets smaller as the
number of iterations increases. All of the regions in each
image are uniform and there are no holes within the regions.
The peaks in the histogram are widely separated and
distinct. Results in the table show that the shape becomes
more clearly defined with more iterations. The table also
shows that the mean is a reasonable value to use as a
threshold. It is evident from the result that this type of
scene does allow the relaxation routine to detect a target
and would allow for the possible classification of the
target if the proper weighting factors are selected.
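
The SHAPE values tabulated in Table 4.1 are numerically consistent with twice the region area divided by the perimeter. This relation is inferred here from the tabulated numbers only and is not stated in this section, so it should be read as an assumption. The function name `shape_factor` is illustrative. A minimal Python check on three rows of the table:

```python
def shape_factor(area, perimeter):
    """Assumed shape measure, 2 * AREA / PERIM (inferred from the table, not from the text)."""
    return 2.0 * area / perimeter

# (AREA, PERIM, SHAPE) triples copied from Table 4.1
rows = [(808, 134, 12.06), (904, 131, 13.80), (541, 109, 9.93)]

for area, perim, tabulated in rows:
    computed = shape_factor(area, perim)
    print(f"area={area} perim={perim} computed={computed:.2f} tabulated={tabulated:.2f}")
```

Under this reading, a larger SHAPE value corresponds to a region with more area per unit of boundary length, that is, a more compact region.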

2. Medium-size Ship (Figure 4.2(b))

Results of this experiment are seen in Figure 4.4
and Table 4.2. This image clearly shows the
effect of the number of iterations on establishing
well-defined peaks. Figures 4.4(g)-(j) display the effects
on the same image of one, two, four, and eight iterations.
After one iteration, the valley between the peaks is better
defined than in the original histogram, and after eight
iterations the separation is near a maximum.

The weighting factors have a tremendous effect on the
segmented regions. Figures 4.4(a)-(d) show that if Alpha1 ≥
Alpha2 the region increases in area and the target cannot be
detected. In the case where Alpha1 < Alpha2 the target is
detectable. By increasing the number of iterations, the
segmented region develops into a form which can be neither
detected nor classified as a ship, as was seen in
the result of the first case. Fewer iterations also
produce more segmented regions (Table 4.2), which are small.


TABLE 4.2

QUANTITATIVE RESULTS OF MEDIUM-SIZE SHIP

ALPHA1  ALPHA2  ITER  THD  REGION   AREA  PERIM  SHAPE
0.3     0.3        2  220       2   1428      ?   7.09
0.3     0.3        2  128       2   1428    403   7.09
0.3     0.3        8  220       3   4025    668  12.05
0.3     0.3        8  128       2   4109    732  11.23
0.6     0.2        2  220       4   2209    344  12.84
0.6     0.2        2  128       3   2512    475  10.57
0.6     0.2        8  220       5   5584   1075  10.87
0.6     0.2        8  128       5   5605   1024  10.95
0.2     0.6        2  220       4   1154    376   6.14
0.2     0.6        2  128       3   2883    513  11.24
0.2     0.6        8  220       1   2643    587   9.01
0.2     0.6        8  128       1   2663    553   9.63
0.1     0.2        1  220       1    851    338   5.04
0.1     0.2        1  128       2    988    289   6.84
0.1     0.2        2  220       3    968    396   4.89
0.1     0.2        2  128       2   2732    523  10.45
0.1     0.4        4  220       4   1089    404   5.39
0.1     0.4        4  128       1   2684    561   9.57
0.1     0.4        8  220       2   1437    419   6.86
0.1     0.4        8  128       2   2133    492   8.67

MEAN = 128

(? = value illegible in source)

However, increasing the number of iterations results in fewer
regions, and these regions are larger. With more
iterations, the region becomes more homogeneous, but still
contains gaps. The case where Alpha1 > Alpha2
demonstrates how noise near the target becomes part of the
target. This is because Alpha1 weights the higher
intensities, thus causing the growth. This is a good illustration
of why Alpha1 < Alpha2 is chosen: it confines the higher
gray level intensities to the target and separates it from
the adjacent noise.

[Figure 4.4 (continued): panel (f); legend values for Alpha1, Alpha2, and Iter illegible in source; horizontal axis: gray level]

3. Sailboat (Figure 4.2(c))

This image is a black-hot (inverted) infrared image.
For the relaxation routine to work properly, the object to
be segmented must be lighter than the background.
Therefore, the image must be inverted first. Figures
4.5(a)-(d) show results of segmenting this image. This image
is similar to the first case (ship in low contrast) in that
it provides a detectable target and, as seen in Figures
4.5(c)-(d), it could be classified as a sailboat. This is
more readily observed in Figure 4.5(c). This case shows
that increasing the number of iterations does not
necessarily decrease the size of the region as was seen in
earlier cases (Table 4.3). The images are uniform and
homogeneous and contain no holes; peaks are distinct, sharp,
and widely separated.
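
The inversion step described above can be sketched as follows. This is a minimal illustration assuming 8-bit gray levels (0 to 255) stored as a list of rows; the function name `invert_image` is hypothetical and is not part of the software used in the thesis.

```python
def invert_image(image, max_level=255):
    """Return a copy of the image with gray levels reversed (dark becomes light)."""
    return [[max_level - pixel for pixel in row] for row in image]

# A dark (black-hot) target on a light background becomes a light target.
frame = [
    [200, 200, 200],
    [200,  30, 200],
    [200, 200, 200],
]
print(invert_image(frame))
```

After inversion the target pixel has the highest gray level in the frame, which is the condition the relaxation routine requires.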

4. Large Ship (Figure 4.2(d))

This image is a good example of an object in a noisy
background which can be segmented into a region which can be
both detected and classified. Figures 4.6(a)-(d)
depict the effect of relaxation on this image. The best
results are seen in Figures 4.6(a) and (b), where the gray
level peaks are clearly defined and widely separated. These


[Figure 4.5: Results of (caption truncated in source). Legible panel legend: (b) Alpha1 = .6, Alpha2 = .2, Iter = 8]


TABLE 4.3

QUANTITATIVE RESULTS ON SAILBOAT

ALPHA1  ALPHA2  ITER  THD  REGION   AREA  PERIM  SHAPE
0.6     0.2        2  220       1   2309    526   8.78
0.6     0.2        2  108       1   2724    296  18.41
0.6     0.2        8  220       1   2883    301  19.16
0.6     0.2        8  108       1   2883    301  19.16
0.1     0.4        2  220       1   1051    382   5.50
0.1     0.4        2  108       1   2042    234  17.45
0.1     0.4        8  220       1   1488    366   8.13
0.1     0.4        8  108       1   1801    217  16.60

MEAN = 108

results also allow for the easy selection of a threshold
value. This case and the previous cases have demonstrated
that the threshold used to determine the area and
the size of the region can be chosen as the mean value of
the original image without significantly changing the
measured parameters. The images are uniform and homogeneous
after eight iterations in each case. Gaps are seen in the
first iteration (Figure 4.6(a)), but are filled in after
eight iterations.

5. Series of Frames of Single Ship
a. Ship A (Figure 4.2(e))

This is the first of a series of six images
(Figures 4.2(e)-(j)) which depict a ship at various
orientations as the camera is rotating. This scene clearly
shows the separation between the sky, sea, and target. The




TABLE 4.4

QUANTITATIVE RESULTS ON LARGE SHIP

ALPHA1  ALPHA2  ITER  THD  REGION   AREA  PERIM  SHAPE
0.6     0.2        2  220       1   3170    430  14.74
0.6     0.2        2  111       1   3450    332  20.78
0.6     0.2        8  220       1   3673    308  23.85
0.6     0.2        8  111       1   3679    307  23.97
0.1     0.4        2  220       1    248    139   3.57
0.1     0.4        2  111       3   2924    294  19.89
0.1     0.4        8  220       1   1687    534   6.32
0.1     0.4        8  111       2   2569    260  19.76

MEAN = 111

histogram shows three distinct peaks in the gray levels of
Figure 4.2(e). Figures 4.7(a)-(e) show the effect of the
segmentation on this image. This case demonstrates how a
high threshold and few iterations will segment the image into
several regions. When Alpha1 ≥ Alpha2, the target and the
sky merge into one region after only two iterations. This
prevents the detection and classification of the target.
The situation becomes worse after eight iterations.

In the case where Alpha1 < Alpha2 (Figures
4.7(d)-(e)), the target is detectable after two iterations,
but increasing the number of iterations creates the same
result as the situation mentioned above: the sky and the
target merge into one region. In this case (see Figure
4.2(e)), the peak associated with the sky and the [target are]
greater than the mean; therefore these pixels [are grouped]
together. This explains why these two [illegible in source]
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TABLE  4.5 

QUANTITATIVE RESULTS OF SHIP A

ALPHA1  ALPHA2  ITER  THD  REGION   AREA  PERIM  SHAPE
0.3     0.3        2  220       6     49     34   2.88
0.3     0.3        2  120       2   6180    663  18.64
0.3     0.3        8  220       1   6139    625  19.64
0.3     0.3        8  120       2   6167    616  20.02
0.6     0.2        2  220       2   6012   7054  17.06
0.6     0.2        2  120       1   6935    768  18.06
0.6     0.2        8  220       1   7329    706  20.76
0.6     0.2        8  120       1   7366    697  21.14
0.4     0.1        2  220       3   3598   1088   6.61
0.4     0.1        2  120       1   7139    844  16.98
0.4     0.1        8  220       1   7475    680  21.99
0.4     0.1        8  120       1   7662    635  24.13
0.2     0.6        2  220       1    191    116   3.29
0.2     0.6        2  120       1   5488    692  15.86
0.2     0.6        8  220       1   5034    688  14.63
0.2     0.6        8  120       1   5069    712  14.24
0.1     0.4        2  220       1    698    189   7.39
0.1     0.4        2  120       1   5513    703  15.68
0.1     0.4        8  220       5    546    366   2.98
0.1     0.4        8  120       1   4749    675  14.07

MEAN = 120


together.  If  the  pixels  associated  with  the  sky  had  been 
less  than  the  mean,  the  target  would  have  been  grouped  into 
its  own  region,  permitting  the  detection  and  possible 
classification  of  the  target.  In  general,  the  target  is 
uniform  and  homogeneous,  and  gaps  in  the  target  region  are 
eliminated.  However,  the  desired  shape  of  a  ship  does  not 
occur  with  more  iterations. 


b.  Ship  B  (Figure  4.2(f)) 

This is the second image in the series. The sky
is to the left, the white region to the immediate left of
the target is caused by glare, and the sea is to the right
of the target. This effect is again due to the rotation of
the sensor. Of the series of images seen in Figures 4.8(a)-
(e), Figure 4.8(e) permits the detection of a target and
its orientation. The object cannot be classified in any of
the cases. After eight iterations, the glare and the ship
merge into one region, as would be expected based on the
two-class segmentation scheme. This is a good example of how
noise of similar intensity near or contained within the
target can become merged with it into one region. This reduces the
ability to classify the target. Quantitative results are
shown in Table 4.6.

c.  Ship  C  (Figure  4.2(g)) 

This  image  is  similar  to  the  previous  case  in 
that  the  background  immediately  surrounding  the  object  has 
an  intensity  closely  matching  the  object  of  interest. 
Figure 4.9 shows results of applying the relaxation
segmentation technique. Figures 4.9(d) and (e) show
cases suggesting that a target may be located in the area, or that a
tremendous  amount  of  glare  from  light  reflected  off  the 


[Figure 4.8: Results of relaxation segmentation on Ship B. Columns: Iter = 2, Iter = 8. Rows: (a) Alpha1 = .3, Alpha2 = .3; (b) Alpha1 = .6, Alpha2 = .2; (c) Alpha1 = .4, Alpha2 = .1; (d) Alpha1 = .2, Alpha2 = .6; (e) Alpha1 = .1, Alpha2 = .4]
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TABLE  4.6 

QUANTITATIVE  RESULTS  OF  SHIP  B 
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TABLE  4.7 

QUANTITATIVE RESULTS OF SHIP C

ALPHA1  ALPHA2  ITER  THD  REGION   AREA  PERIM  SHAPE
0.3     0.3        2  220       1   2021    483   8.37
0.3     0.3        2  174       1   2466    432  11.42
0.3     0.3        8  220       1   2558    370  13.83
0.3     0.3        8  174       1   2563    368  13.93
0.6     0.2        2  220       1   2495    462  10.80
0.6     0.2        2  174       2   2917    402  14.51
0.6     0.2        8  220       1   3263    425  15.36
0.6     0.2        8  174       1   3265    425  15.36
0.4     0.1        2  220       1   2466    481  10.25
0.4     0.1        2  174       1   2944    431  13.66
0.4     0.1        8  220       1   3365    437  15.40
0.4     0.1        8  174       1   3462    426  16.25
0.2     0.6        2  220       3   1397    450   6.21
0.2     0.6        2  174       1   2133    403  10.59
0.2     0.6        8  220       1   1824    334  10.92
0.2     0.6        8  174       1   1826    333  10.97
0.1     0.4        2  220       2    967    431   4.49
0.1     0.4        2  174       1   2081    431   9.66
0.1     0.4        8  220       1   1567    349   8.98
0.1     0.4        8  174       1   1604    335   9.58

MEAN = 174


increasing  the  number  of  iterations,  the  routine  produced 
only  fewer  and  smaller  regions.  Thus  the  algorithm  could 
not  provide  information  that  an  object  may  be  within  the 
frame  of  interest.  This  case  clearly  shows  that  the 
relaxation  method  fails  in  this  situation.  Quantitative 
results  are  shown  in  Table  4.8. 


e.  Ship  E  (Figure  4.2(i)) 

This is the fifth image in the series for the
same ship. Figure 4.11 and Table 4.9 show the results of
segmentation. This is a situation where a possible object
may be in the frame. This case demonstrates that
increasing the number of iterations defines the segmented region
more clearly by eliminating the noise near the
object of interest. Also, more iterations reduce the
number of segmented regions. The target is detected in this
case. However, the result does not allow for the classification of
this target.


Figure 4.11: Segmentation of Ship E
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TABLE  4.9 

QUANTITATIVE RESULTS OF SHIP E

ALPHA1  ALPHA2  ITER  THD  REGION   AREA  PERIM  SHAPE
0.2     0.6        2  220       5    241      ?   7.88
0.2     0.6        2  189       5    251      ?   8.66
0.2     0.6        8  220       1    857    190   9.02
0.2     0.6        8  189       1    857      ?   9.02
0.1     0.4        2  220       4    853    278   6.14
0.1     0.4        2  189       5    249      ?   8.37
0.1     0.4        8  220       ?      ?      ?      ?
0.1     0.4        8  189       1    727    161   9.03

(? = value illegible in source)


f. Ship F (Figure 4.2(j))

The final image which was analyzed shows results
(Figure 4.12 and Table 4.10) which are similar to those seen
in Figures 4.8 and 4.9. The glare which dominates the left
side of the ship merges into the same region as the ship
after only two iterations. This makes classification
impossible and greatly reduces the possibility of detection.
This case also demonstrates how several iterations can
reduce the size of the segmented region, and
that with a lower threshold value (i.e.,
the mean) there are fewer segmented regions (1 versus 6, or
1 versus 3), thus enabling an observer to focus on the one
large region.


[Figure 4.12: Segmentation of Ship F. Columns: Iter = 2, Iter = 8. Rows: (a) Alpha1 = .2, Alpha2 = .6; (b) Alpha1 = .1, Alpha2 = .4]


TABLE 4.10

QUANTITATIVE RESULTS OF SHIP F

ALPHA1  ALPHA2  ITER  THD  REGION   AREA  PERIM  SHAPE
0.2     0.6        2  220       6   1222    318   7.69
0.2     0.6        2  171       1   2411    437  11.03
0.2     0.6        8  220       1   2055    361  11.39
0.2     0.6        8  171       1   2056    361  11.39
0.1     0.4        2  220       3    794    280   5.67
0.1     0.4        2  171       1   2341    430  10.89
0.1     0.4        8  220       1   1715    324  10.59
0.1     0.4        8  171       1   1797    331  10.86

MEAN = 171
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C.  SUMMARY  OF  RESULTS 

The results show that for these cases, where the target
has a high gray level and contains noise due to the
environment and the sensor, it is best to have Alpha1 <
Alpha2. This reduces the chance that noise which has gray
levels greater than the mean will be included in the desired
segmented region. Care must be taken to select appropriate
values for Alpha1 and Alpha2; otherwise, the region will
become so small that the object of interest is not
classifiable. The target may still be detectable, however,
and the result will provide the orientation of the target.

The process works well on an image which is similar to
Figure 4.2(a). The peaks in the histogram are clearly
defined and are sharp, not bell-shaped as in the case of the
noisy images (Figures 4.1(b)-(j)). The noisy images can be
segmented and are generally identifiable if the target to be
segmented occupies approximately ten percent or more of the frame
of interest (Figures 4.2(b)-(e)). However, if the object
occupies less than three percent of the image plane (Figure
4.2(h)), it is difficult or impossible, as in this case, to
segment it by this method.

The segmented region in all cases was uniform and
homogeneous, and any holes within the region were eliminated
after several iterations. These are all desirable
properties of a segmentation routine. In summary, all
but one of the cases (Figure 4.2(h)) provided for the
detection of a possible target, and four of the test images
(Figures 4.3(a), (b), (c), and (d)) could be used as an
input to a classification system.


Image  segmentation  is  a  critical  step  in  the  image 
analysis  and  pattern  recognition  process.  Errors  which 
occur  at  this  step  may  propagate  through  additional  stages 
of  a  pattern  recognition  system  producing  an  incorrect 
description  of  the  scene.  The  gradient  relaxation  technique 
is  an  iterative  probability  adjustment  technique  that  can  be 
used for segmentation. It takes advantage of both the
'parallel' and the 'sequential' processing methods. The
relaxation  approach  itself  has  two  major  advantages:  1)  the 
classification  decisions  become  better  informed  as  the 
analysis  proceeds,  and  2)  the  method  can  use  probabilistic 
classifications  rather  than  making  firm  decisions 
immediately. 

The approach is well suited to the segmentation of
noisy infrared images having unimodal distributions. Noise
near or within the target will be filtered out because each
pixel's probability classification is adjusted based on the
probabilistic classification of its neighbors. The gradient
relaxation technique maximizes the gray level intensity of
the target, allowing for easier detection. The weighting
factors must be chosen carefully. These factors are
critical in determining the rate of convergence (length of
time to maximize the intensities), the extent to which noise is
eliminated from the image, and the shape of the segmented
region. The technique is still a subjective process, and the
ability of the observer to set the proper values of these
factors is important.
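
The neighbor-based probability adjustment described above can be illustrated with a short Python sketch. The update rule used here is an assumption: it follows a form commonly described for two-class gradient relaxation (raise a pixel's target-class probability by a fraction Alpha1 of its distance to 1 when the mean neighbor probability exceeds 0.5, and scale it down by Alpha2 when the mean neighbor probability is below 0.5). The rule actually used in the thesis is defined in an earlier chapter that is not reproduced here, and the names `relax` and `factor` are hypothetical.

```python
def relax(image, alpha1, alpha2, iterations, factor=1.0):
    """Two-class relaxation sketch on a list-of-rows 8-bit image (assumed update rule)."""
    rows, cols = len(image), len(image[0])
    mean = sum(sum(row) for row in image) / (rows * cols)
    # Initial probability of the "target" class, centered on the mean gray level.
    p = [[min(1.0, max(0.0, 0.5 + factor * (v - mean) / 255.0)) for v in row]
         for row in image]
    for _ in range(iterations):
        new_p = [row[:] for row in p]
        for r in range(rows):
            for c in range(cols):
                # Probabilities of the (up to) eight neighbors inside the frame.
                neigh = [p[rr][cc]
                         for rr in range(max(0, r - 1), min(rows, r + 2))
                         for cc in range(max(0, c - 1), min(cols, c + 2))
                         if (rr, cc) != (r, c)]
                if not neigh:
                    continue  # single-pixel image: nothing to compare against
                q = sum(neigh) / len(neigh)
                if q > 0.5:
                    new_p[r][c] = p[r][c] + alpha1 * (1.0 - p[r][c])
                elif q < 0.5:
                    new_p[r][c] = (1.0 - alpha2) * p[r][c]
        p = new_p
    # Map probabilities back to gray levels.
    return [[int(round(255 * v)) for v in row] for row in p]

# A bright 3 x 3 block (gray level 200) on a darker background (gray level 50).
frame = [[200 if 1 <= r <= 3 and 1 <= c <= 3 else 50 for c in range(5)]
         for r in range(5)]
result = relax(frame, alpha1=0.2, alpha2=0.6, iterations=8)
print(result[2][2])  # gray level at the center of the block after relaxation
```

In this sketch Alpha1 controls how quickly probabilities rise toward the target class and Alpha2 how quickly they fall toward the background class, which is one way to read the sensitivity to the weighting factors reported in the results.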

The relaxation method is an ideal technique for region
extraction because it sharpens the peaks in
the histogram, creates homogeneous and uniform regions, and
yields a detected target that conforms well to its original shape,
i.e., a ship. This method is not suitable for edge
detection of objects in noisy infrared images. The noise
causes gaps in the edges at places where the transition
between regions is not abrupt. Additional edges may be
detected at points that are not part of the region
boundaries.

Noisy  images  are  primarily  unimodal  making  the  selection 
of  a  threshold  difficult.  This  analysis  showed  that  the 
threshold  can  be  easily  selected  as  the  mean  gray  level 
intensity  of  the  image.  This  allows  for  precious 
computational  time  to  be  spent  for  segmentation  or  other 
image  processing,  instead  of  being  spent  to  search  for  a 
threshold  for  additional  image  analysis. 
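
Selecting the threshold as the mean gray level can be sketched in a few lines of Python. This is a minimal illustration on a list-of-rows image; the function names `mean_threshold` and `binarize` are hypothetical.

```python
def mean_threshold(image):
    """Mean gray level of the image, used directly as the threshold."""
    pixels = [v for row in image for v in row]
    return sum(pixels) / len(pixels)

def binarize(image, threshold):
    """Mark pixels strictly above the threshold as target (1), others as background (0)."""
    return [[1 if v > threshold else 0 for v in row] for row in image]

frame = [
    [10, 10, 10],
    [10, 250, 10],
    [10, 10, 10],
]
t = mean_threshold(frame)
print(t, binarize(frame, t))
```

The threshold costs a single pass over the pixels, with no histogram search, which is the computational saving referred to above.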

The technique is unable to separate noise of similar
gray-level intensity near or within the target. This
introduces errors into the image segmentation result, making
classification of the target difficult, if not impossible.
The technique fails to segment targets which are not
contiguous (i.e., broken up by the noise). The intended
target either is segmented into several small regions or (if
the intensity level of the noise is near that of the target)
becomes part of the noise. This makes detection and
classification of the target impossible.

This technique could be implemented in hardware as part
of a signal processor. By implementing the technique as
part of the processor, the requirements for an infrared
sensor could be reduced. Possible requirements which would
be reduced or eliminated include the signal-to-noise ratio,
detectivity, and cooling requirements of the sensor; the
weight and power of the system would possibly be reduced as well.
Money saved in the cost of the sensor could be used to
enhance the computing capabilities of the signal processor.
Possible applications are missiles, remotely piloted
vehicles (RPVs), aircraft, and remote sensors aboard
spacecraft.

In summary, the gradient relaxation technique is a
viable method to use with uncooled infrared sensors to detect
targets. The ability of the technique to eliminate or
reduce noise of intensity less than that of the target, thus
enhancing the target and providing for detection, has been
shown. The technique could possibly be used as one of the
inputs of a classification process (i.e., shape matching) or
classification system, but only for those images where the
intensity of the target is greater than that of the noise,
or where the target has large spatial separation from
noise of similar intensity.
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APPENDIX:  EXPERIMENTAL  PROCEDURE 


The  infrared  images  used  in  this  analysis  were  obtained 
from  an  infrared  uncooled  focal  plane  array  sensor.  The 
images  were  then  recorded  and  stored  on  a  video  disc.  Using 
the  EYECOM  digitizing  system,  individual  frames  were 
extracted  from  the  video  disc.  The  EYECOM  system  creates  an 
image  file  of  640  blocks  of  512  bytes.  This  file  must  be 
reduced  to  512  blocks  of  512  bytes  in  order  to  be  displayed 
on  the  COMTAL  image  processing  system.  This  file  was 
further  reduced  to  64  blocks  of  256  bytes  to  reduce  the 
processing  time. 
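
The file sizes quoted above can be checked with a little arithmetic. The sketch below assumes one byte per pixel and square frames; neither assumption is stated in the text, and the variable and function names are illustrative.

```python
import math

# Block counts and block sizes quoted in the text.
digitized_bytes = 640 * 512   # file written by the digitizing system
display_bytes = 512 * 512     # file accepted by the display system
reduced_bytes = 64 * 256      # file used to shorten processing time

def square_side(n_bytes):
    """Side of a square one-byte-per-pixel image, or None if n_bytes is not a perfect square."""
    side = math.isqrt(n_bytes)
    return side if side * side == n_bytes else None

for name, size in [("digitized", digitized_bytes),
                   ("display", display_bytes),
                   ("reduced", reduced_bytes)]:
    print(name, size, "bytes; square side:", square_side(size))
```

Under these assumptions the display file corresponds to a 512 x 512 frame and the reduced file to a 128 x 128 frame, a sixteen-fold reduction in the number of pixels processed per iteration.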

The  measurements  made  in  Chapter  IV  of  the  area  and 
perimeter  were  obtained  by  calling  subroutines  in  the 
Subroutine  Package  for  Image  Data  Enhancement  and 
Recognition  (SPIDER)  image  processing  package.  The  routines 
which  were  used  are: 

1. CLAB - The routine assigns labels (serial numbers) to
each segmented region. Each pixel in a region is
assigned a label. This routine produces a labeled
image.

2.  AREA1  -  The  routine  counts  the  number  of  pixels  within 
every  region  in  a  labeled  image. 

3.  PRMT1  -  This  routine  measures  the  perimeter  of  every 
region  in  a  labeled  image.  [Ref.  23] 
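
The three measurement steps listed above (labeling, area, perimeter) can be sketched in Python as follows. This is not the SPIDER interface: the function names are hypothetical, and the use of 4-connectivity and of a perimeter counted as exposed pixel edges are assumptions, since the conventions of CLAB and PRMT1 are not given here.

```python
from collections import deque

NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connectivity (assumed)

def label_regions(binary):
    """Assign serial numbers 1, 2, ... to connected regions of nonzero pixels."""
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and labels[r][c] == 0:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:  # breadth-first flood fill of one region
                    y, x = queue.popleft()
                    for dy, dx in NEIGHBORS:
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count

def region_areas(labels):
    """Number of pixels carrying each label."""
    areas = {}
    for row in labels:
        for lab in row:
            if lab:
                areas[lab] = areas.get(lab, 0) + 1
    return areas

def region_perimeters(labels):
    """Number of pixel edges of each region that face another label or the frame border."""
    rows, cols = len(labels), len(labels[0])
    perims = {}
    for r in range(rows):
        for c in range(cols):
            lab = labels[r][c]
            if not lab:
                continue
            for dy, dx in NEIGHBORS:
                ny, nx = r + dy, c + dx
                inside = 0 <= ny < rows and 0 <= nx < cols
                if not inside or labels[ny][nx] != lab:
                    perims[lab] = perims.get(lab, 0) + 1
    return perims

binary = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 1],
]
labels, n_regions = label_regions(binary)
print(n_regions, region_areas(labels), region_perimeters(labels))
```

The region count, areas, and perimeters produced this way correspond to the REGION, AREA, and PERIM columns of the tables in Chapter IV, up to the differences in connectivity and perimeter convention noted above.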
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